(54) Title: METHOD AND DEVICE FOR PROCESSING A MULTI-CHANNEL SIGNAL FOR USE WITH A HEADPHONE




(57) Abstract 

A method and device processes multi-channel audio signals, each channel corresponding to a loudspeaker placed in a particular
location in a room, in such a way as to create, over headphones, the sensation of multiple "phantom" loudspeakers placed throughout
the room. Head Related Transfer Functions (HRTFs) are chosen according to the elevation and azimuth of each intended loudspeaker
relative to the listener, each channel being filtered with an HRTF such that when combined into left and right channels and played over
headphones, the listener senses that the sound is actually produced by phantom loudspeakers placed throughout the "virtual" room. A
database collection of sets of HRTF coefficients from numerous individuals and subsequent matching of the best HRTF set to the individual
listener provides the listener with listening sensations similar to that which the listener, as an individual, would experience when listening
to multiple loudspeakers placed throughout the room. An appropriate transfer function applied to the right and left channel output allows
the sensation of open-ear listening to be experienced through closed-ear headphones.

METHOD AND DEVICE FOR PROCESSING A MULTICHANNEL
SIGNAL FOR USE WITH A HEADPHONE

Background of the Invention

Field of the Invention. The present invention relates to a method and device for processing
a multi-channel audio signal for reproduction over headphones. In particular, the present invention
relates to an apparatus and method for creating, over headphones, the sensation of multiple
"phantom" loudspeakers in a user-matched virtual listening environment.

Background Information. In an attempt to provide a more realistic or engulfing listening
experience in the movie theater, several companies have developed multi-channel audio formats.
Each audio channel of the multi-channel signal is routed to one of several loudspeakers distributed
throughout the theater, providing movie-goers with the sensation that sounds are originating all
around them. At least one of these formats, for example the Dolby Pro Logic® format, has been
adapted for use in the home entertainment industry. The Dolby Pro Logic® format is now in wide
use in home theater systems. As with the theater version, each audio channel of the multi-channel
signal is routed to one of several loudspeakers placed around the room, providing home listeners with
the sensation that sounds are originating all around them. As the home entertainment system market
expands, other multi-channel systems will likely become available to home consumers.

When humans listen to sounds produced by loudspeakers, it is termed open-ear listening.
Open-ear listening occurs when the ears are uncovered. It is the way we listen in everyday life. In
an open-ear environment, the sound information arriving at the ears provides cues about the location
and distance of the sound source. Humans are able to localize a sound to the right or left based on
differences in the arrival times and differences in the sound levels at the two ears. Other subtle
differences in the spectrum of the sound at each ear drum provide cues about the sound source
elevation and front/back location. These differences are related to the filtering effects of several
body parts, most notably the head and the pinnae of the ears.

The process of listening while the outer surface of the ear is covered (e.g., with
headphones) is termed closed-ear listening. Covering the ear changes the ear canal resonance
characteristics. Due to the physical effects of wearing headphones, sound delivered through
headphones lacks the subtle differences in time, level, and spectra caused by location, distance, and
the filtering effects of the head and pinnae experienced in open-ear listening. Thus, when headphones
are used with multi-channel home entertainment systems, the advantages of listening via numerous
loudspeakers placed throughout the room are lost, the sound often appearing to be originating inside
the listener's head.

There is a need for a system that can process multi-channel audio in such a way as to cause
the listener to sense multiple "phantom" loudspeakers when listening over headphones. Such a
system should process each channel such that the effects of loudspeaker location and distance
intended to be created by each channel signal, as well as the filtering effects of the listener's head and
pinnae, are preserved or simulated accurately for that individual listener.

Accordingly, an object of the present invention is to provide a method for processing the
multi-channel output typically produced by home entertainment or like systems such that when
presented over headphones, the listener is able to select a best match set of head related transfer
functions from a database of measured head related transfer functions, such that
the listener experiences the sensation of multiple "phantom" loudspeakers placed throughout the
room.

Another object of the present invention is to provide an apparatus for processing the
multi-channel output typically produced by home entertainment or like systems such that when presented
over headphones, the listener experiences listening sensations most like that which the listener, as
an individual, would experience when listening to multiple loudspeakers.

Another object of the present invention is to provide an apparatus for processing the
multi-channel output typically produced by home entertainment or like systems such that when presented
over headphones, the listener experiences sensations typical of open-ear (unobstructed) listening.

Another object of the present invention is to provide an apparatus for measuring
the acoustic filtering action produced by the head and pinnae of the human ears so as to provide a
useful database of head related transfer functions.

Another object of the present invention is to create a database of HRTFs representative of
the general listening public by measuring and recording a large enough set of such HRTFs such that
any given individual is likely to be able to select a set of HRTFs from the database such that, when
the selected set is used to process an audio signal, the user perceives the corresponding sounds to be
localized in the proper spatial positions.

Another object of the present invention is to provide a means of determining the
"best-match" of an individual listener to one of the HRTF sets of the representative database such that the
individual listener can be matched as closely as possible to an already measured HRTF set stored
in a database, such that once properly matched, the individual will experience [...]
locations of the sources of the listening system.

Another object of the present invention is to provide a wired or wireless transmission system
for dimensionalized listening of sound over headphones.

Other objects of the invention will become clear from a review of the complete disclosure. 

Summary of the Invention

According to the present invention, multiple channels of an audio signal are processed
through the application of filtering using a head related transfer function (HRTF) or a plurality of
HRTFs, selected by a user, such that when reduced to two channels, left and right, each channel
contains information that enables the listener to sense the location of multiple phantom loudspeakers
when listening over headphones.

Also according to the present invention, multiple channels of an audio signal are processed
through the application of filtering using HRTFs chosen from a large database such that when
listening through headphones, the listener experiences a sensation that most closely matches the
sensation the listener, as an individual, would experience when listening to multiple loudspeakers.

In another exemplary embodiment of the present invention, the right and left channels are
filtered in order to simulate the effects of open-ear listening.

In another exemplary embodiment of the present invention, a complete set of HRTFs for an
individual is measured and recorded, such that the measured HRTFs are an accurate reflection of the
filtering effects of that individual's head and pinnae, and in which the measurement [...]
of a few minutes. For each individual, several hundred HRTFs are measured such that an HRTF is
specified for each location in space about the listener with an accuracy of approximately 10° in both
the vertical and horizontal dimensions.

In a further embodiment of this invention, the HRTFs of a sufficient number of individuals
are measured and stored to create a database such that a given individual is able to select a set of
HRTFs from the database such that when audio signals are processed with the selected set of
HRTFs, the user perceives the corresponding sounds to be localized in the proper positions.

In a further embodiment, the database of HRTFs comprises a representative set of HRTF
sets.

In another exemplary embodiment of the present invention, an individual is matched to a
"best-match" set of HRTFs selected from a database of sets of HRTFs measured from a
representative sample of the general listening population, where the individual listener participates
in the matching of the set of HRTFs by comparing the perception created by different HRTF sets and
selecting the HRTF set providing the best spatial perception.

In another exemplary embodiment of the present invention, a database of HRTF sets,
measured from a representative sample of the listening population, is established, such that an
individual can select a "best-match" set of HRTFs from the database.

In a further embodiment, a best match set of HRTFs [...]
and is used to process signals for wired or wireless transmission to a listener wearing headphones.

Brief Description of the Drawings
Figure 1 is a representation of sound waves received at both ears of a listener sitting in a
room with a typical multi-channel loudspeaker configuration.

Figure 2 is a representation of the listening sensation experienced through headphones
according to an exemplary embodiment of the present invention.

Figure 3a shows the sound source locations used to measure a set of head related transfer
functions (HRTFs) obtained at multiple elevations and azimuths surrounding a listener.

Figure 3b is a graph representing the HRTF for 0 degrees elevation and 30 degrees azimuth
for three different individuals.

Figure 4 is a schematic in block diagram form of a typical multi-channel headphone
processing system according to an exemplary embodiment of the present invention.

Figure 5 is a schematic in block diagram form of a bass boost circuit according to an
exemplary embodiment of the present invention.

Figure 6A is a schematic in block diagram form of HRTF filtering as applied to a single
channel according to an exemplary embodiment of the present invention.

Figure 6B is a schematic in block diagram form of the process of HRTF matching based
on an ordered set of HRTFs according to the present invention.

Figure 7 is a representation of a typical digital signal transmission system comprising a
transmitting station, a connecting medium called a channel and a receiving station.

Figure 8A is a block diagram of a novel radio-frequency transmission system for use in a
wireless embodiment of this invention.

Figure 8B is a representation of an adaptive filter for removing the DC component of a
digital signal.

Figure 9A shows a computer simulated input gaussian noise source with a variance of 2.5
mV and a mean of 0.5 V.

Figure 9B shows the tracking constant, C[k], during a computer simulation of the removal
of the DC component of an input gaussian noise source by an adaptive filter.

Figure 9C shows the output of an adaptive filter where the input is a gaussian noise source.

Figures 9D and 9E show the magnitude frequency response of the input gaussian noise
waveform and DC shifted output.

Figure 9F is a schematic of a state machine.

Figure 9G is a timing diagram of various clock outputs for decoding signals encoded
according to one embodiment of this invention.

Figure 10 depicts an HRTF matching process according to the present invention.

Figure 11 shows an impulse response waveform recorded from one individual at one spatial
location for one ear.

Figure 12 illustrates critical band filtering according to the present invention. 

Figure 13 illustrates an exemplary subject filtered HRTF matrix according to the present
invention.

Figure 14 illustrates a hypothetical hierarchical agglomerative clustering procedure in two
dimensions according to the present invention.

Figure 15 illustrates a hypothetical hierarchical agglomerative clustering procedure
according to an exemplary embodiment of the present invention.

Figure 16 is a schematic in block diagram form of a typical reverberation processor
constructed of parallel lowpass comb filters.

Figure 17 is a schematic in block diagram form of a typical lowpass comb filter.

Figure 18a is a schematic of a preferred embodiment of an HRTF measurement means.

Figure 18b further illustrates a preferred embodiment of an HRTF measurement means.

Figure 19 is a schematic representation of the HRTF measurement control system.

Figure 20 is a schematic representation of the HRTF measurement control system software
flowchart.

Figure 21A is a schematic representation of a front view of a sound room in which HRTFs
may be measured to produce the database of HRTFs of this invention.

Figure 21B is a schematic representation of a top view of the sound room.

Figure 21C shows the detail of the cross section of the wall of the sound room.

Figure 22A shows the probability that the RMS distance, between any individual's HRTF
and the nearest HRTF already in the database, is less than a certain RMS distance, as a function
of the number of HRTF sets in the database.

Figure 22B shows the cumulative density function of the distance between each of 150
HRTFs and the mean HRTF.

Figure 22C shows the change in average mean as a function of subsample group size.

Figure 22D shows the change in average standard deviation as a function of subsample
group size.

Figure 22E shows the mean minimum distance between any HRTF set of the 150 HRTF
sets and one of the stored HRTF sets as a function of the number of stored HRTF sets.

Figures 23A, B, C are block diagrams of a circuit according to this invention for processing
signals using a best match set of HRTFs selected by a user from the database of this invention.

Figure 24 is a detail of an early reflection processing circuit 612 according to Figure 23.

Figure 25 is a detail of an HRTF processing circuit 663 according to Figure 23 comprising
finite impulse response filters that implement HRTFs selected [...]

Figure 26 is a detail of a reverberation circuit 671 according to Figure 23.

Figure 27 is a detail of a bass boost processing circuit 670 according to Figure 23.

Figures 28A, B, C are a schematic representation of the HRTF selection and matching
performed by a user to arrive at a best match set of HRTFs which is then used for processing of
audio signals according to Figures 25 and 23.

Figures 29A, B are an alternate embodiment to that disclosed in Figures 28A, B, and C.

Detailed Description of the Invention
The method and device according to the present invention processes audio signals, including
multi-channel audio signals having a plurality of channels, each corresponding to a loudspeaker
placed in a particular location in a room, in such a way as to create, over headphones, the sensation
of multiple "phantom" loudspeakers placed throughout the room. The present invention utilizes
Head Related Transfer Functions (HRTFs) that are chosen according to the elevation and azimuth
of each intended loudspeaker relative to the listener, each channel being filtered by a set of HRTFs
such that when combined into left and right channels and played over headphones, the listener
senses that the sound is actually produced by phantom loudspeakers placed throughout the "virtual"
room.

The filtering of the present invention utilizes a database collection of sets of HRTFs
measured from numerous individuals and subsequent matching of the best HRTF set to the individual
listener, thus providing the listener with listening sensations similar to that which the listener, as an
individual, would experience when listening to multiple loudspeakers placed throughout the room.
Additionally, the present invention utilizes an appropriate transfer function applied to the right and
left channel output so that the sensation of open-ear listening may be achieved through closed-ear
headphones.

In generating the database collection of sets of HRTFs, the present invention also provides
a measurement device and method for measuring and recording complete sets of HRTFs of subjects
from a representative sample of the listening population, such that the measured HRTFs are an
accurate reflection of the filtering effects of the head and pinnae of each of the subjects measured.
For each individual, as many as 360 HRTFs for each ear might be measured, with each HRTF
depending on the position or location of the sound source with respect to the listener. These
measured HRTF sets are stored in a database, such that the database provides HRTF sets from which
any individual can select a set of HRTFs such that when audio signals are processed with the selected
set of HRTFs, the user perceives the corresponding sounds to be localized in the proper spatial
positions, to thereby achieve optimized 3D virtual audio effects when using headphones.

Figure 1 depicts the path of sound waves received at both ears of a listener according to a
typical embodiment of a home entertainment system. The multi-channel audio signal is decoded into
multiple channels, i.e., a two-channel encoded signal is decoded into a multi-channel signal in
accordance with, for example, the Dolby Pro Logic® format. Each channel of the multi-channel
signal is then played, for example, through its associated loudspeaker, e.g., one of five loudspeakers:
left; right; center; left surround; and right surround. The effect is the sensation that sound is
originating all around the listener.

Figure 2 depicts the listening experience created by an exemplary embodiment of the present
invention. As described in detail with respect to Figure 4, the present invention processes each
channel of a multi-channel signal using a set of HRTFs appropriate for the distance and location of
each phantom loudspeaker (e.g., the intended loudspeaker for each channel) relative to the listener's
left and right ears. All resulting left ear channels are summed, and all resulting right ear channels
are summed, producing two channels, left and right. Each channel is then preferably filtered using
a transfer function that introduces the effects of open-ear listening. When the two channel output
is presented via headphones, the listener senses that the sound is originating from five phantom
loudspeakers placed throughout the room, as indicated in Figure 2.
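
The per-channel filtering and per-ear summation described for Figure 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the circuit of Figure 4: the function names fir_filter and binaural_downmix, the dictionary layout, and the very short tap lists are assumptions made for the example, and the optional open-ear filter is left out.

```python
def fir_filter(signal, taps):
    """Time-domain FIR filtering, truncated to len(signal) output samples."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def binaural_downmix(channels, hrtf_pairs):
    """Filter every input channel with its (left, right) tap pair and sum per ear.

    channels:   dict of channel name -> list of samples (all the same length)
    hrtf_pairs: dict of channel name -> (left_ear_taps, right_ear_taps)
    Both layouts are hypothetical and chosen only for this example.
    """
    length = len(next(iter(channels.values())))
    left = [0.0] * length
    right = [0.0] * length
    for name, samples in channels.items():
        left_taps, right_taps = hrtf_pairs[name]
        for n, value in enumerate(fir_filter(samples, left_taps)):
            left[n] += value   # all left-ear outputs are summed together
        for n, value in enumerate(fir_filter(samples, right_taps)):
            right[n] += value  # all right-ear outputs are summed together
    return left, right
```

With five input channels and one tap pair per intended loudspeaker position, the two returned lists would play the role of the left and right headphone channels.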

The manner in which the ears and head filter sound may be described by a Head Related
Transfer Function (HRTF). An HRTF is a transfer function obtained from one individual for one
ear for a specific sound source location. An HRTF is described by multiple coefficients that
characterize how sound produced at a particular spatial position should be filtered to simulate the
filtering effect of the head and outer ear of a particular individual. HRTFs are typically measured
at various elevations and azimuths. Typical HRTF measurement locations are illustrated in Figure
3A.

In Figure 3A, the horizontal plane located at the center of the listener's head represents
0.0° elevation. The vertical plane extending forward from the center of the head represents 0.0°
azimuth. HRTF locations are defined by a pair of elevation and azimuth coordinates and are
represented by a small sphere 110. In one embodiment of this invention, HRTFs are measured in
10 degree intervals for the azimuth and 10 degree intervals for the elevation from 30 degrees below
the horizon to 60 degrees above the horizon. Associated with each sphere 110 is a set of HRTF
coefficients that represent the transfer function for that sound source location. Each sphere 110 is
actually associated with two HRTFs, one for each ear.
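
As a worked check on the sampling grid just described, the following Python sketch enumerates the measurement locations. It assumes that azimuth covers the full circle in 10 degree steps (0 to 350 degrees), which the text does not state explicitly; the variable names are likewise illustrative. Under that assumption the grid holds 360 locations, which agrees with the figure of as many as 360 HRTFs per ear given in this description.

```python
# Hypothetical enumeration of the HRTF measurement grid of Figure 3A.
AZIMUTH_STEP_DEG = 10
ELEVATION_STEP_DEG = 10

# Assumption: azimuth spans the full circle, 0 to 350 degrees.
azimuths = list(range(0, 360, AZIMUTH_STEP_DEG))
# From 30 degrees below the horizon to 60 degrees above it, inclusive.
elevations = list(range(-30, 60 + 1, ELEVATION_STEP_DEG))

# One (elevation, azimuth) pair per sphere 110.
locations = [(el, az) for el in elevations for az in azimuths]

# Each location carries two HRTFs, one per ear.
hrtfs_per_subject = 2 * len(locations)
```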

No two human beings have HRTFs which
are exactly alike. This fact is demonstrated in Figure 3B which shows a graph representing the
HRTF for 0 degrees elevation and 30 degrees azimuth for three different individuals. As can be
seen, each of these individuals has quite different HRTFs. Therefore, for each individual, it is critical
to use a set of HRTFs for filtering audio signals such that when the audio signals are filtered, the user
perceives the corresponding sounds to be localized in the proper positions, in order to optimally
create the sensation that the particular signal originates from the location which is intended by the
HRTF processing. There have been some efforts to use a "universal" set of HRTFs, where each
user is presented with the same set of HRTFs, having some average characteristics. However, as one
can see from Figure 3B, a "universal" set of HRTFs would give very different sensations to each of
the three individuals depicted. For instance, if an individual's HRTF had a peak (or valley) at a
frequency f, while the universal HRTF had a contradictory valley (or peak) at the same frequency
f, the individual would interpret the directional cues of the signal incorrectly. These inaccurate or
poorly matched HRTFs degrade the overall 3D perception of the individual, the amount of
degradation depending on the individual. This was experimentally demonstrated by Wightman and
Kistler (1993).

In order to improve performance beyond the use of a single or "universal" HRTF, and to
overcome the impracticalities of measuring an individual set of HRTFs for each individual, the
present invention provides a database of HRTFs collected from a measured group of the general
population. For example, the HRTFs are collected from numerous individuals of both sexes with
varying physical characteristics. The present invention then employs a unique process whereby the
sets of HRTFs obtained from all individuals are organized into an ordered set and stored in a
read only memory (ROM) or other storage device. An HRTF matching processor enables each user
to select, from the sets of HRTFs stored in the ROM, a set of HRTFs such that when audio signals
are processed with the selected set of HRTFs, the user perceives the corresponding sounds to be
localized in the proper spatial positions.

An exemplary embodiment of the present invention is illustrated in Figure 4. After the
multi-channel signal has been decoded into its constituent channels, for example channels 1, 2, 3,
4 and 5 in the Dolby Pro Logic® format, selected channels are routed to a bass boost
circuit 6. For example, channels 1, 2 and 3 are processed by the bass boost circuit 6. Output
channels 7, 8 and 9 from the bass boost circuit 6, as well as channels 4 and 5, are then each
electronically processed to create the sensation of a phantom loudspeaker for each channel.

Processing of each channel is accomplished through digital filtering using sets of HRTF
coefficients, for example via HRTF processing circuits 10, 11, 12, 13 and 14. The HRTF processing
circuits can include, for example, a suitably programmed digital signal processor. A best match
between the listener and a set of HRTFs is selected via the HRTF matching processor 59. Based on
the best match set of HRTFs, a preferred pair of HRTFs, one for each ear, is selected for each
channel as a function of the intended loudspeaker position of each channel of the multi-channel
signal. In an exemplary embodiment of the present invention, the best match set of HRTFs are
selected from an ordered set of HRTFs stored in ROM 65 via the HRTF matching processor 59 and
routed to the appropriate HRTF processor 10, 11, 12, 13 and 14.

Prior to the listener selecting a best match set of HRTFs, sets of HRTFs stored in the HRTF
database 63 are processed by an HRTF ordering processor 64 such that they may be stored in ROM
65 in an ordered sequence to optimize the matching process via HRTF matching processor 59. Once
the optimal pair of HRTFs for each channel have been selected by the listener, separate HRTFs are
applied for the right and left ears, converting each input channel to dual channel output.

Each channel of the dual channel output from, for example, the HRTF processing circuit 10
is multiplied by a scaling factor as shown, for example, at nodes 16 and 17. This scaling factor
reflects signal attenuation as a function of the distance between the phantom loudspeaker and the
listener's ear. All right ear channels are summed at node 26. All left ear channels are summed at
node 27. The output of nodes 26 and 27 results in two channels, left and right, each of
which contains signal information necessary to provide the sensation of left, right, center, and rear
loudspeakers intended to be created by each channel of the multi-channel signal, but now configured
to be presented over conventional two transducer headphones.

Additionally, parallel reverberation processing may optionally be performed on one or more
channels by reverberation circuit 15. In a free-field, the sound signal that reaches the ear includes
information transmitted directly from each sound source as well as information reflected off of
surfaces such as walls and ceilings. Sound information that is reflected off of surfaces is delayed in
its arrival at the ear relative to sound that travels directly to the ear. In order to simulate surface
reflection, at least one channel of the multi-channel signal would be routed to the reverberation
circuit 15, as shown in Figure 4.

In an exemplary embodiment of the present invention, one or more channels are routed
through the reverberation circuit 15. The circuit 15 includes, for example, numerous lowpass comb
filters in parallel configuration. This is illustrated in Figure 16. The input channel is routed to
lowpass comb filters 140, 141, 142, 143, 144 and 145. Each of these filters is designed, as is known
in the art, to introduce the delays associated with reflection off of room surfaces. The output of the
lowpass comb filters is summed at node 146 and passed through an allpass filter 147. The output
of the allpass filter is separated into two channels, left and right. A gain, g, is applied to the left
channel at node 147. An inverse gain, -g, is applied to the right channel at node 148. The gain g
allows the relative proportions of direct and reverberated sounds to be adjusted.

Figure 17 illustrates an exemplary embodiment of a lowpass comb filter 140. The input to
the comb filter is summed with filtered output from the comb filter at node [...]. The summed signal
is routed through the comb filter 151 where it is delayed D samples. The output of the comb filter
is routed to node 146, shown in Figure 16, and also summed with feedback from the lowpass filter
153 loop at node 152. The summed signal is then input to the lowpass filter 153. The output of the
lowpass filter 153 is then routed back through both the comb filter and the lowpass filter, with gains
applied of g1 and g2 at nodes 154 and 155, respectively.
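
The reverberation structure of Figures 16 and 17 can be approximated by the Python sketch below. It is a hedged illustration rather than the disclosed circuit: which of the two loop gains feeds the lowpass filter and which feeds the comb input is an assumption (here g1 is the one-pole lowpass feedback and g2 the comb feedback), the allpass stage uses the common Schroeder form with an assumed coefficient a, and all delay lengths are placeholders. For the loop to remain stable in this form, g2 / (1 - g1) should stay below 1.

```python
def lowpass_comb(x, delay, g1, g2):
    """Comb filter with a one-pole lowpass in its feedback loop.

    Assumed roles: g1 is the lowpass feedback gain, g2 the comb feedback gain.
    """
    buf = [0.0] * delay          # delay line of D samples
    lp = 0.0                     # state of the one-pole lowpass filter
    out = []
    for n, sample in enumerate(x):
        idx = n % delay
        delayed = buf[idx]               # signal delayed by D samples
        lp = delayed + g1 * lp           # lowpass filter in the loop
        buf[idx] = sample + g2 * lp      # input summed with filtered feedback
        out.append(delayed)
    return out


def allpass(x, delay, a):
    """Schroeder allpass: y[n] = -a*x[n] + x[n-D] + a*y[n-D]."""
    buf = [0.0] * delay
    out = []
    for n, sample in enumerate(x):
        idx = n % delay
        delayed = buf[idx]
        y = -a * sample + delayed
        buf[idx] = sample + a * y
        out.append(y)
    return out


def reverberate(x, comb_delays, g1, g2, allpass_delay, a, g):
    """Parallel lowpass combs, summed, then one allpass, then gains +g and -g."""
    mixed = [0.0] * len(x)
    for d in comb_delays:
        for n, value in enumerate(lowpass_comb(x, d, g1, g2)):
            mixed[n] += value
    diffused = allpass(mixed, allpass_delay, a)
    left = [g * v for v in diffused]     # gain g on the left channel
    right = [-g * v for v in diffused]   # inverse gain -g on the right channel
    return left, right
```

Feeding an impulse through lowpass_comb with g1 = 0 shows the expected echo train: the first echo arrives after D samples and each later echo is scaled by g2.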

The effects of open-ear (non-obstructed) resonation are optionally added at circuit 29 in
Figure 4. The ear canal resonator according to the present invention is designed to simulate open-ear
listening via headphones by introducing the resonances and anti-resonances that are characteristic
of open-ear listening. It is generally known in the psychoacoustic art that open-ear listening
introduces certain resonances and anti-resonances into the incoming acoustic signal due to the
filtering effects of the outer ear. The characteristics of these resonances and anti-resonances are also
generally known and may be used to construct a generally known transfer function, referred to as the
open ear transfer function, that, when convolved with a digital signal, introduces these resonances
and anti-resonances into the digital signal.

Open-ear resonation circuit 29 compensates for the effects introduced by obstruction of the
outer ear via, for example, headphones. The open ear transfer function is convolved with each
channel, left and right, using, for example, a digital signal processor. The output of the open-ear
resonation circuit 29 is two audio channels 30, 31 that, when delivered through headphones, simulate
the listener's multi-loudspeaker listening experience by creating the sensation of phantom
loudspeakers throughout the simulated room in accordance with the loudspeaker layout provided by
the format of the multi-channel signal. Thus, the ear resonation circuit according to the present
invention allows for use with any headphone, thereby eliminating a need for uniquely designed
headphones.

Sound delivered to the ear via headphones is typically reduced in amplitude in the lower
frequencies. Low frequency energy may be increased, however, through the use of a bass boost
system. An exemplary embodiment of a bass boost circuit 6 is illustrated in Figure 5. Output from
selected channels of the multichannel system is routed to the bass boost circuit 6. Low frequency
signal information is extracted by performing a low-pass filter at, for example, 100 Hz on one or
more channels, via low pass filter 34. Once the low frequency signal information is obtained, it is
multiplied by predetermined factor 35 [...]
38, 39 and 40, thereby boosting the low frequency energy.
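
A minimal Python sketch of this kind of bass boost follows. Several details are assumptions made for illustration because the text does not fix them: the lowpass filter is a simple one-pole design, the sample rate is 44.1 kHz, and the scaled low band is added back to the original channel. The function names are hypothetical.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """First-order (RC-style) lowpass filter; an assumed stand-in for filter 34."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out = []
    state = 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def bass_boost(samples, boost_factor, cutoff_hz=100.0, sample_rate_hz=44100.0):
    """Extract the low band, scale it, and (assumption) add it back to the channel."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate_hz)
    return [x + boost_factor * l for x, l in zip(samples, low)]
```

With this arrangement a constant (0 Hz) input is raised by a factor of 1 + boost_factor, while content far above the cutoff passes through almost unchanged.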

To create the sensation of multiple phantom loudspeakers over headphones, the HRTF
coefficients associated with the location of each phantom loudspeaker relative to the listener must
be convolved with each channel. This convolution is accomplished using a digital signal processor
and may be done in either the time or frequency domains with filters of 16 to 32 taps.
Because HRTFs differ for right and left ears, the single channel input to each HRTF processing
circuit 10, 11, 12, 13 and 14 is processed in parallel by two separate HRTFs, one for the right ear
and one for the left ear. The result is a dual channel (e.g., right and left ear) output. This process
is illustrated in Figure 6A.

Figure [...] illustrates the interaction of HRTF matching processor 59 with
HRTF processing circuit 10. Using the digital signal processor of HRTF processing circuit 10, the
signal for each channel of the multi-channel signal is convolved with two different HRTFs. For
example, Figure 6A shows the left channel signal 7 being applied to the left and right HRTF
processing circuits 43, 44 of the HRTF processing circuit 10. One set of HRTF coefficients
corresponding to the spatial location of the phantom loudspeaker relative to the left ear is applied
to signal 7 via the left ear HRTF processing circuit 43, the other set of HRTF coefficients corresponding
to the spatial location of the phantom loudspeaker relative to the right ear and being applied to signal
7 via the right ear HRTF processing circuit 44.

The HRTFs applied by HRTF processing circuits 43, 44 are selected from the set of HRTFs
that best matches the listener via the HRTF matching processor 59. The output of each circuit 43,
44 is multiplied by a scaling factor via, for example, nodes 16 and 17, also as shown in Figure 4.
This scaling factor is used to apply signal attenuation that corresponds to that which would be
achieved in a free field environment. The value of the scaling factor is inversely related to the
distance between the phantom loudspeaker and the listener. As shown in Figure 4, the right ear
output is summed for each phantom loudspeaker via node 26, and the left ear output is summed for
each phantom loudspeaker via node 27.
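The per-channel processing described above (two HRTF filters per channel, a distance-dependent
scaling factor, then a sum per ear) can be sketched as follows. This is a minimal illustration and
not the patented implementation: the function names, the dictionary layout, and the choice of
gain = 1/distance as the "inversely related" scaling factor are all assumptions made for the
example, and the tap values used to exercise it would be hypothetical.

```python
def convolve(signal, taps):
    """Time-domain FIR convolution; returns len(signal) + len(taps) - 1 samples."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += x * h
    return out


def render_binaural(channels, hrirs, distances):
    """Mix loudspeaker channels down to (left, right) headphone signals.

    channels  : dict name -> list of samples for that loudspeaker channel
    hrirs     : dict name -> (left_ear_taps, right_ear_taps), hypothetical coefficients
    distances : dict name -> distance of the phantom loudspeaker from the listener
    """
    length = max(
        len(channels[c]) + max(len(hrirs[c][0]), len(hrirs[c][1])) - 1
        for c in channels
    )
    left = [0.0] * length
    right = [0.0] * length
    for name, signal in channels.items():
        # Assumed form of the scaling factor: inversely proportional to distance.
        gain = 1.0 / distances[name]
        for ear_sum, taps in ((left, hrirs[name][0]), (right, hrirs[name][1])):
            # Each channel is filtered once per ear, scaled, then summed into that ear.
            for i, y in enumerate(convolve(signal, taps)):
                ear_sum[i] += gain * y
    return left, right
```

With 16 to 32 taps per filter the direct time-domain loop is inexpensive; a frequency-domain
implementation would produce the same output.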

Once the left and right channel signals are processed and contain signal information
necessary to provide the intended multi-channel sensation, the signal can be transmitted to
conventional two transducer headphones. These signals can be transmitted by wire or wirelessly,
for example, by a radio frequency (RF) transmission system. Examples of wireless transmission
systems are exemplified in Examples 2, 3, and 4.

A central feature of this invention is to provide a sufficiently diverse and comprehensive set
of HRTFs so that the user can select from that set one HRTF set which will produce the perception
of sound located in the proper spatial position. This selection process is accomplished herein by:
(1) collecting a comprehensive database of HRTFs; (2) ordering the database so that a representative
subset of the entire collection of HRTFs can be obtained and stored in the device; and (3) providing
a means for a user to select from the representative subset.

As described earlier, a single HRTF (see Figure 3B) is the spectrum obtained by presenting
sound from a single location 110 (see Figure 3A). A listener's HRTF (head related transfer function)
refers to the set of HRTFs obtained from the multiple locations described, for example, in Figure 3A.
For any source location, two HRTFs are measured, one for the listener's left ear and one for the
right ear. Thus, if L locations are measured, the set of 2*L spectra represent the HRTF of that
listener. If S subjects are measured, an entire data base consisting of S*L*2 spectra is generated.
In one embodiment, 360 locations (L=360) were measured and HRTFs on over 150 subjects were
collected. Thus, the total data base consists of more than 108,000 spectra. These, or representative
spectra are chosen (see below), and are stored in a database 63 (see Figures 4 and 6B).

For collecting these spectra a special robot arm was constructed. Prior measurement devices
involved the use of multiple (e.g., 12) loudspeakers located on [...]. Each of the multiple
loudspeakers were used to create a signal used to measure the head-related transfer function. In using
these prior measurement devices, signals from each of the multiple loudspeakers were projected from
a different location to allow measurements of HRTFs for different elevations and azimuths.
However, the use of multiple loudspeakers poses a problem. To avoid contamination of the
measured HRTF, the different loudspeakers need to have equal output spectra. Unfortunately, it is
only possible to equate such spectra to within about 0.5 dB.

Advantageously, in the present invention, an improved measurement device is provided
utilizing a single loudspeaker located at the end of a robot arm. The single loudspeaker is used for
all HRTF measurements, thereby eliminating the problem of unequal output spectra of different
loudspeakers. The single loudspeaker is precisely positioned by a computer-controlled robot arm
in each of the locations where an HRTF is to be measured. The present HRTF measurement device
can measure and record a complete set of 360 HRTFs for each ear, for an individual, in
approximately 10 to 15 minutes, as compared to one-to-four hours for prior measurement
techniques. Because the listener should remain stationary during the entire measurement process,
the speeding-up of the measurement process can, itself, contribute to the accuracy of the
measurements.

Provided in Figure 18A is a schematic of a preferred embodiment of an HRTF measurement
means according to this invention. At 200 there is provided a speaker, preferably a 4 Ohm, 40 watt
speaker, for example, produced by Pioneer. At 201, there is provided a lower arm, with dimensions
approximately [...] wide, about 2" high and about 29" long. At 202, there is provided an elbow AC
servo motor, preferably capable of high rotational speeds and torques (e.g. about 20,000 rpm, and
about 200 oz.-in.), and an absolute encoder (e.g. about 500 count/rev.). Affixed to the elbow AC
servo motor, there is provided an elbow planetary gearbox 203, preferably with a ratio of about
100:1 and a torque capability of about 275 in.-lb. An upper arm 212 is connected to the lower arm
201 through the elbow AC servo motor 202. At the upper end of the upper arm 212, there is
provided a shoulder spur gear pair 204, preferably having a ratio of about 11.1111:1. Maintaining
the shoulder spur gear in appropriate linkage with the upper arm 212 is a mounting bracket with
bearings 205. The mounting bracket 205 is suspended from a rotation shaft 206 having a diameter
of about 1-1/4". A rotation spur gear pair 207 is provided with a ratio of about 12.8:1, to rotate the
rotation shaft 206. A rotation planetary gearbox 208, having a ratio of about 100:1 and a torque
capability of about 275 in.-lb., drives the rotation spur gear pair 207. A rotation servo motor and
associated absolute encoder 209 having a speed of about 20,000 rpm, a torque of about 200 oz.-in.,
with the encoder being amenable to 500 count/rev., are provided to actuate the rotation planetary
gearbox 208. A shoulder planetary gearbox 210, having a ratio of about 100:1 and a torque output
of about 275 lb.-in., is actuated by an associated shoulder servo motor 211 having a speed of about
20,000 rpm and a torque output of about 200 oz.-in. and an absolute encoder capable of about 500
count/rev., which are linked to the shoulder spur gear 204. A wrist gearmotor
213 having a speed of about 50 rpm and a torque of about 178 oz.-in. with an associated analog
encoder are provided to position the speaker 200.

In Figure 18B, there is provided a detail of the upper arm 212, the elbow planetary gearbox
203, the elbow AC servo motor and absolute encoder 202, the mounting bracket with bearings 205,
the rotation shaft 206, the shoulder planetary gearbox 210, the shoulder servo motor and absolute
encoder 211 and the drive shaft 214.

In Figure 19, there is provided a schematic representation of the HRTF measurement control
system. This includes a central control computer 300 which, in a first loop, controls a servo
controller 301 which drives a plurality of servo amps 302a-c, which in turn drive a plurality of linked
encoder, servo motor and gearboxes 303a-c. Encoder/servo motor/gearbox 303a drives rotation,
while 303b drives the shoulder, and 303c drives the arm (see Figure 18). In a second loop, the
central control computer 300 controls data acquisition, signal presentation and speaker control via
a feedback loop comprising: an encoder/gear/motor assembly 304 for positioning the speaker 305;
an A/D converter 306, a D/A converter 307, and an attenuator 308. The feedback loop links through
an amplifier 309 to the speaker 305 and to a microphone pre-amplifier 310 coupled to
microphones 311a and 311b. It will be appreciated that the above described hardware, and in
particular the specifics of the various motor and gear power, rotation rates and ratios are all subject
to modifications without adversely affecting the general principle of rapid, automated HRTF data
acquisition with improved accuracy.

The above described hardware may be controlled by software which [...]
of the speaker. A preferred embodiment of such software is schematically represented in Figure 20.
As can be seen, the software controls system startup at 400, system initialization [...]
of a main menu 402. Subroutines 403-408 are provided which allow for loading of data 403,
speaker calibration 404, headphone measurement 405, performance of an HRTF test run 406,
performance of a full HRTF measurement run 407, and termination of the program 408. A
schematic of a full HRTF measurement run 407 is shown in steps 407a-407q, all of which are
initiated by selection of element 407 at the main menu. At 407a the full HRTF measurement run is
initiated, following which the measured subject is identified 407b, the robot arm is calibrated 407c,
via a feedback loop 407d which repeats arm calibration until a calibration "OK" signal 407e is
received. The robot arm is set to a zero starting position 407f, and the measurement routine begins
407g. This includes movement of the robot arm and speaker 407h about the subject whose HRTF
sets are being measured. The acquired data is played/recorded 407i and the HRTF azimuth and
elevation is displayed 407j on a monitor. A continuous interrupt query 407k is sent and as long as
no interrupt signal is received, the measurement process is looped 407l back to measurement step
407g. If an interrupt signal is received, the system resets 407p to the main menu, 407q. If the
measurement routine is completed without interruption, a complete set of HRTFs are measured until
the natural termination of the measurement routine is reached 407m. A pause 407n is included in
the routine to allow the system to store 407o the acquired HRTFs, after which the system resets to
the main menu 407q.

The headphone measurement 405 comprises steps 405a-405h, which are initiated by
selecting this option at the main menu: at 405a, the routine is initiated, following which sounds are
played through the headphone and displayed 405b. A pause 405c is included in the routine to allow
time for data retrieval and initiation of a subroutine 405d. If a particular headphone subroutine is
not to be initiated 405e the system resets to the main menu. However, if a particular headphone
subroutine is to be initiated, a particular headphone identity is entered 405f and the data acquired
for that headphone is stored 405g following which the system resets to the main menu 405h.

Optimally, the HRTF measurements are made in an appropriately constructed sound facility.
In a preferred embodiment of this invention, the measurements are made in a room such as that
schematically depicted in Figures 21A, 21B, and 21C. This room, shown in a front view in Figure
21A, provides an exhaust fan 500 and an air outlet channel 510. A latched door 520 is provided,
preferably with latches on both the inside and outside. A fresh air fan 530 is provided for
replenishment of fresh air from the outside of the room through [...]. In Figure
21B, a schematic of a top view of the sound room is provided, including a representation of the
subject seat 550, a monitoring camera 560, a pair of laser pointers 570, and sound absorbent walls
580. In Figure 21C a detail of the wall cross section is provided, showing a double wall structure
in which there is provided two layers of dry wall 581 between which there is placed a damping
material 582, preferably selected from foam rubber, polyurethane or like sound insulating material.

A further improvement in the present HRTF measurement device and method lies in the placement
of the transducer employed to record the sound signal used in calculating the HRTF. Prior
measurement techniques attempted to measure the sound as close to the eardrum as possible, by
placing a narrow tube deep into the outer ear canal to measure the HRTF just at the eardrum.
However, through physical considerations of the nature of sound transmission [...]
ear canal is [...], we conclude that only a plane wave travels in the ear canal below frequencies of
about 23,000 to 26,000 hertz. Since only plane waves travel in the ear canal at these frequencies,
we expect that there is no directional information derived from the effect of the ear canal on the
incoming sound. Since no directional information is derived from propagation of the sound down
the ear canal, in the present HRTF measurement device and method, the transducer may be placed
at the entrance of the outer ear canal, instead of [...]. In
addition to being less uncomfortable for the individual "wearing" the transducer, the external
location of the transducer provides a much higher S/N ratio than previous locations for the
transducer. This higher S/N ratio provides a more accurate HRTF, especially [...]
HRTF where the greatest attenuation of the incoming impulse signal exists.

The database of measured HRTFs is ordered by comparing the spectra recorded from
different individuals. This is accomplished by transforming or pre-processing the raw data to
represent the perceptual features of the raw spectra more accurately. The raw HRTFs are measured
as the impulse response to a digital signal propagated by a loudspeaker at a given location. The
signal so generated is carefully measured in the free-field (in the listener's absence) [...]
imperfections in the spectrum of the loudspeaker. The measured impulse response is then converted
to the frequency domain using a fast Fourier transform [...]
the art. This frequency domain representation is further processed by implementing critical-band
filtering and converting the data from a linear frequency scale to a logarithmic scale. Critical-band
filtering reflects the fact that the first stage of the auditory system contains bandpass filters whose
bandwidth is a constant fraction of the center frequency of the filter. The critical band filters
resemble 1/6 octave bandpass filters. In addition, the distance along the auditory display is roughly
proportional to the logarithm of sound frequency. Therefore, a logarithmic, rather than a linear,
frequency scale is imposed on the representation.

In an exemplary embodiment, a gammatone filter is used to perform critical band filtering.
The magnitude of the frequency response is represented by the function:

$$g(f) = \frac{1}{\left(1 + (f - f_c)^2 / b^2\right)^2}$$

where $f$ is frequency, $f_c$ is the center frequency of the critical band filter, and $b$ is a
bandwidth parameter proportional to the equivalent rectangular bandwidth (ERB). ERB varies
as a function of frequency such that $\mathrm{ERB} = 24.7\,[4.37\,(f_c/1000) + 1]$. For each
critical band filter, the magnitude of the frequency response is calculated for each frequency, f,
and is multiplied by the magnitude of the HRTF at the same frequency, f. For each critical band
filter, the results of this calculation at all frequencies are squared and summed. The square root is
then taken. This results in one value representing the magnitude of the internal HRTF for each
critical band filter.
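The critical band computation just described (weight the HRTF magnitude by the filter response,
square, sum, take the square root) can be sketched as follows. The sketch assumes the magnitude
response g(f) = 1/(1 + (f - fc)^2/b^2)^2, takes b = 1.019 ERB (a common gammatone convention that
is not confirmed by the legible text), and expresses the result in decibels with 20 log10. The
function names and the 39-band spacing between 3 and 18 kHz are illustrative choices.

```python
import math


def erb(fc):
    """Equivalent rectangular bandwidth in Hz: 24.7 * (4.37 * fc / 1000 + 1)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_magnitude(f, fc, b_factor=1.019):
    """g(f) = 1 / (1 + (f - fc)^2 / b^2)^2 with b = b_factor * ERB (b_factor is assumed)."""
    b = b_factor * erb(fc)
    return 1.0 / (1.0 + (f - fc) ** 2 / b ** 2) ** 2


def log_spaced_centers(f_lo=3000.0, f_hi=18000.0, n=39):
    """Center frequencies spaced equally on a logarithmic axis, endpoints included."""
    ratio = (f_hi / f_lo) ** (1.0 / (n - 1))
    return [f_lo * ratio ** i for i in range(n)]


def internal_hrtf_db(freqs, hrtf_mag, centers):
    """One level (dB) per critical band: sqrt(sum((g(f) * |HRTF(f)|)^2)) converted to dB."""
    levels = []
    for fc in centers:
        energy = sum(
            (gammatone_magnitude(f, fc) * m) ** 2 for f, m in zip(freqs, hrtf_mag)
        )
        # Floor the energy so that an all-zero band does not break log10.
        levels.append(20.0 * math.log10(math.sqrt(max(energy, 1e-30))))
    return levels
```

Here `freqs` and `hrtf_mag` would be the FFT bin frequencies and magnitudes of one measured
impulse response; the output replaces that spectrum with one level per band.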

The hearing system is sensitive to a fixed fractional change in signal magnitude, which is
known in the field as "Weber's Law." Thus, if stimulus magnitude is represented on a logarithmic
scale, such as decibels, the ear is sensitive to a fixed number of decibels. In sum, the internal
spectrum is represented by the level of the stimulus in decibels at about 12-18 frequencies per octave
in the range between 3 and 18 kHz. Outside this frequency range (3 to 18 kHz) the human auditory
system gains little or no directional or localization information based on the shape of the
spectrum. In fact, few listeners but the very young can hear sound above 18 kHz. At the lower
frequencies, the spectrum of the signal is essentially the same for any azimuth or elevation. At the
lower frequencies, however, especially below 4 kHz, differences in time of arrival at the two ears
(interaural time cues) are important to indicate differences in the azimuthal position of the source.

Such filtering results in a new set of HRTFs, the internal HRTFs, [...]
necessary for human listening. If, for example, the function 20 log10 is applied to the center
frequency of each critical band filter, the frequency domain representation of the internal HRTF
becomes a log spectrum that more accurately represents the perception of sound by humans.
Additionally, the number of values needed to represent the internal HRTF is reduced from that
needed to represent the unprocessed HRTF. An exemplary embodiment of the present invention
applies critical band filtering to the set of HRTFs from each individual in the HRTF database 63,
resulting in a new set of internal HRTFs. The process is illustrated in Figure 12, wherein an impulse
response waveform 80 shown in Figure 11 is filtered via a critical band filter 81 to produce the
internal HRTF 82.

The application of critical band filtering results in, for example, N logarithmic frequency
bands located in the 3000 Hz to 18,000 Hz range. Associated with each of these N frequencies is
the level in that band in decibels. In one exemplary embodiment, N=39, the levels are measured with
a density of about 15 levels per octave. The entire data base, given S subjects and L locations, is
described by 2*S*L*N values and is illustrated in Figure 13. This pre-processing summarizes the
more salient perceptual features of the acoustic filtering produced when
a listener hears a sound at a given position in space.

HRTFs obtained from the different subjects and transformed or pre-processed as described
above can now be compared and organized so that their similarities and differences can be
quantified. One basic method of comparing two or more spectra is the Euclidean distance. The
Euclidean distance is equal to the root-mean-squared (RMS) difference in decibels between the levels
measured at the same frequencies in the two or more spectra. For a collection of HRTFs obtained
from the right ear of S subjects, we can compare this set by forming a distance matrix of S rows
and S columns, in which the entry (i, j) is the distance in decibels between the
HRTF of the "ith" and "jth" individuals. Naturally, the distance measure is symmetric, so the entry
(i, j) is equal to the entry (j, i), and the distance between any individual and themselves is zero, so
the diagonal entries (i, i), where i=j, are all zero. It is on the basis of the similarities and differences
between the processed HRTFs that the database is ordered.
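A minimal sketch of the S-by-S distance matrix described above follows. It assumes each
pre-processed HRTF is simply a list of band levels in decibels, all of the same length; the
function names are illustrative.

```python
import math


def rms_distance(a, b):
    """Root-mean-squared difference (dB) between two spectra sampled at the same frequencies."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def distance_matrix(spectra):
    """Symmetric S x S matrix of RMS distances with a zero diagonal."""
    s = len(spectra)
    d = [[0.0] * s for _ in range(s)]
    for i in range(s):
        for j in range(i + 1, s):
            # Fill both (i, j) and (j, i): the measure is symmetric.
            d[i][j] = d[j][i] = rms_distance(spectra[i], spectra[j])
    return d
```

Only the upper triangle is computed, which halves the work for a database of 150 subjects.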

Having explained how the HRTFs are measured and preprocessed, we can now return to
issues raised earlier about how the user of the device selects a particular HRTF from those stored
in the device. The selection process must ensure that [...]
position for the individual user. Thus, the first issue to be addressed is whether the entire set
of measured HRTFs is sufficiently broad and comprehensive to represent the entire listening
population. In one exemplary embodiment, 150 HRTFs were measured from a population in which
both genders and a variety of ages and ethnicities were represented.

Statistical tests of this database suggest that 150 HRTFs constitute a set size sufficient for
the purposes of the subject invention. These tests were all conducted on a sample consisting of 150
sets measured according to this invention. Three HRTFs from each HRTF set were selected for these
comparisons, namely, on the horizon (0 elevation) and at 10, [...]
ahead. It is expected that similar conclusions about stability would apply [...]. Each
of the three HRTFs from each HRTF set consists, for example, of values representing the level of
the HRTF, at a plurality, e.g. 39, of different frequencies. The 39 frequencies are spaced equally,
on a logarithmic frequency axis, from about 3,000 to about 18,000 Hz. Few listeners (except the
very young) can hear sound above 18,000 Hz. The composite spectra obtained from the 3 positions
can be regarded as a vector consisting of 117 levels (dB).

To investigate the issue of database size, we constructed different sized sets of HRTFs by
drawing them at random from the original group of 150 HRTFs. Set sizes of 20, 40, 60, 80, 100,
and 120 HRTFs were constructed. For each of these randomly constructed sets, a single HRTF is
drawn at random and the distance from that individual's HRTF to its nearest neighbor is computed.
These random constructions are repeated many times so that the probability of a given distance can
be estimated. Figure 22A shows a plot of the cumulative probability of that distance for the
different set sizes. For example, if the set size is 20, then the RMS distance in decibels to the nearest
neighbor is less than 2 dB for only about 55% of the individual HRTFs. If the set size is increased
to 40 HRTFs, then [...]% are within 2 dB. As the set size increases to 60, 80, 100, and 120,
little incremental advantage is achieved by adding further HRTFs to the set. This analysis
demonstrates that the basic differences in HRTFs among different individuals is adequately
represented in a database having more than about 100 HRTFs. That is to say, with a raw database
containing 100-200 HRTFs there is a very high likelihood that a randomly selected individual would
find an HRTF sufficiently close to his/her own so as to properly spatialize sound.
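The set-size experiment described above can be sketched as a small Monte Carlo routine: draw a
random set of a given size, pick one member at random, and record whether its nearest neighbor
within the set lies inside a threshold. The function name, the default of 500 trials and the
fixed seed are assumptions for illustration; no measured data is reproduced here, so the routine
says nothing about the percentages reported for Figure 22A.

```python
import math
import random


def rms_distance(a, b):
    """Root-mean-squared difference (dB) between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def nearest_neighbor_fraction(spectra, set_size, threshold_db=2.0, trials=500, seed=0):
    """Estimate P(nearest-neighbor RMS distance < threshold_db) for random sets of set_size."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        subset = rng.sample(spectra, set_size)  # random set, in random order
        probe = subset[0]                       # so the first element is a random pick
        nearest = min(rms_distance(probe, other) for other in subset[1:])
        if nearest < threshold_db:
            hits += 1
    return hits / trials
```

Calling this for set sizes 20, 40, 60, 80, 100 and 120 over a list of 117-level composite spectra
would give one point per set size on a curve of the kind the text describes.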

Another way to approach the issue of stability is to compute a significant statistic of the
dataset and determine how it changes as we vary set size. From the 150 composite spectra, or
vectors, a centroid HRTF is computed. The centroid, itself having 117 levels, is obtained by adding
together, for each of the 117 levels, the value representing the level of the HRTF from each of the
150 composite spectra and dividing each sum by the sample size, 150 in the sample. If each of the
150 composite spectra are treated as a point in a space of 117 dimensions, the centroid is the center
of gravity of the set of 150 points.

The Euclidean distance between the centroid and each of the 150 composite spectra (RMS
distance in dB) can then be measured. The mean of this distance is about 2.53 dB, and the standard
deviation is about 0.76 dB. Figure 22B shows an estimate of a cumulative density function, which
is a plot of the probability of an individual being less than a given value [...]. As
is shown in Figure 22B, the nearest individual in the space was about 1 dB from the centroid;
approximately half the sample was within 2.5 dB of the centroid and about 95% were within 4 dB.
Also shown in Figure 22B, as a solid line, is a cumulative distribution from a normal or Gaussian
distribution with the same mean and standard deviation as the sample, namely, mean = 2.53 dB,
standard deviation = 0.76 dB. The data depart somewhat from this theoretical distribution, but the
similarity is evident.

Given that these data are reasonably approximated by the normal distribution, and
because the Gaussian distribution is completely described by two parameters, the mean and the
standard deviation, the stability of the data is assessed as the number of HRTFs measured is
increased or decreased thus defining larger or smaller databases of HRTF subsets, and observing the
effect this has on the mean and standard deviation. For this assessment, random subsamples are
drawn from the large sample of 150, and the mean and standard deviation of each subsample was
calculated. One thousand randomly drawn subsamples for each of five subsample group sizes,
namely 5, 10, 20, 40, and 80, were taken. Both a mean and standard deviation of the RMS distance
from each of the HRTFs in the subsample to the centroid were computed. The average of the 1,000
means and the average of the 1,000 standard deviations, for each subsample group size were
computed. Figure 22C shows the change in the average mean as the number of HRTFs in the
subsample increases. Figure 22D shows the change in the average standard deviation as the number
of HRTFs in the subsample increases. As can be seen from Figure 22C, the average mean changes
by about 10% in value as the subsample group size goes from 5 to 80 HRTFs. The last point on the
graph is the mean, 2.53 dB, for all 150 HRTFs. Similarly, referring to Figure 22D, the average
standard deviation changes by about 25% in value as the subsample group size goes from 5 to 80
HRTFs. As can be seen from both Figures 22C and 22D there is very little change in the average
mean or average standard deviation for subsample group sizes, for example, greater than about 50.
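The subsample stability assessment can be sketched as below. The text does not say whether "the
centroid" used for each subsample is that of the full set or of the subsample itself; this sketch
assumes the subsample's own centroid, and that choice, like the function names and the fixed
seed, is an assumption rather than something the description confirms.

```python
import math
import random
import statistics


def centroid(spectra):
    """Per-level average across spectra (the center of gravity of the points)."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def rms_distance(a, b):
    """Root-mean-squared difference (dB) between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def subsample_stability(spectra, group_sizes=(5, 10, 20, 40, 80), draws=1000, seed=0):
    """Return {size: (average mean, average standard deviation)} of distances to the centroid."""
    rng = random.Random(seed)
    result = {}
    for size in group_sizes:
        means, deviations = [], []
        for _ in range(draws):
            sub = rng.sample(spectra, size)
            center = centroid(sub)  # assumed: centroid recomputed for each subsample
            dists = [rms_distance(s, center) for s in sub]
            means.append(statistics.mean(dists))
            deviations.append(statistics.stdev(dists))
        result[size] = (statistics.mean(means), statistics.mean(deviations))
    return result
```

Plotting the first element of each tuple against group size would correspond to a curve like
Figure 22C, and the second element to one like Figure 22D.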

Thus, the two critical statistics of the 150 measured HRTFs are reasonably stable, and we
have found that little statistical improvement would be gained increasing the sample size much
beyond 150 samples.

While the preceding has established that the initial database is sufficiently comprehensive
to cover an entire population of listeners, it should also be appreciated that not each of the 100-200
HRTFs contributes equally to that result. This is because there is considerable similarity or
correlation between certain groups within the entire database. This fact suggests that the raw
database can be pruned in some fashion to reduce the total number of HRTFs actually stored in the
device. Several different statistical techniques might be used to provide an organization of the
database that reveals the underlying correlations. These include one of the variety of
multidimensional scaling procedures known in the art. The procedure used in one exemplary
embodiment herein was cluster analysis. Specifically, we used a hierarchical agglomerative
clustering procedure such as that executed by the statistical [...]. This procedure uses
similarities between the HRTFs as measured in a distance matrix of the 150 HRTFs to produce an
ordered tree-like structure to the data. At the highest node of the cluster, all of the HRTFs are
contained. Successive nodes contain HRTFs that are similar to each other and different from the
remainder, just as biological animals are classified as orders, genera [...]. Figure 15 shows
a sample cluster of HRTFs obtained from four subjects. Implicit in this example is the fact that
HRTFs of the left and right ear of a single subject are usually nearer in distance than are one person's
HRTF to any other person's HRTF. Clustering provides a convenient ordering of the entire database,
and subsets of HRTFs can easily be obtained by selecting similar groups determined by the nodes
in the cluster. Those skilled in the art will recognize from this disclosure that other methods of
ordering known in the art could be used.
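As a rough illustration of hierarchical agglomerative clustering over a distance matrix, the
sketch below repeatedly merges the two closest clusters until a single top node holds every HRTF.
The description does not name a linkage rule, so average linkage is assumed here; the function
name and the returned merge format are likewise illustrative.

```python
def agglomerative_clusters(dist):
    """Average-linkage agglomerative clustering over a symmetric distance matrix.

    Returns the merges in order as (members_a, members_b, linkage_distance); the last
    merge joins everything, which corresponds to the highest node of the tree.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters (assumed linkage).
                link = sum(dist[i][j] for i in clusters[a] for j in clusters[b]) / (
                    len(clusters[a]) * len(clusters[b])
                )
                if best is None or link < best[0]:
                    best = (link, a, b)
        link, a, b = best
        merges.append((clusters[a], clusters[b], link))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges
```

Cutting the merge list at a chosen linkage distance yields the groups of similar HRTFs from which
subsets can be drawn.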

A representative subset of HRTF sets from the entire set of 150 HRTF sets, to which a
listener can be matched, is chosen to simplify the matching process. In one embodiment, the HRTF
sets within a representative subset are stored for use according to the [...]. The
greater the number of HRTF sets stored in the device, the more
likely the listener will be matched to an HRTF set similar to the listener's own HRTFs. The
disadvantages of having a very large number of HRTF sets stored in the device are that more
memory is required to store the HRTF sets, with an accompanying increase [...]. In
addition, it would take more time to match the listener with the best-match HRTF set.
In order to balance the competing factors in determining the number of representative HRTF
sets to include in the device, we computed the mean minimum RMS distance between an HRTF set
randomly selected from the entire measured database of HRTF sets (e.g., 150 HRTF sets) and the
representative HRTF set, from the subset of representative HRTF sets chosen to be in the device,
nearest to the randomly selected HRTF set, as a function of the number of representative HRTF sets
chosen to be included in the device. Figure 22E shows the results from two different algorithms for
selecting representative HRTF sets. These results are typical of those obtained using a variety of
algorithms known in the art which can be used to select representative HRTF sets from the database
of HRTFs ordered, for example, by clustering analysis. The illustrated results from both algorithms
show the same trends, whether one selects representative HRTFs from the ordered database based
on the "popularity" of the representative HRTF (i.e. an HRTF that is closest to the other HRTFs
within a given subcluster), or based on the isolation of the representative HRTF (i.e. an HRTF most
distant from other HRTFs within a given subcluster). Namely, as the number of representative
stored HRTF sets decreases from 150 to 12-15, the mean minimum RMS distance increases slowly.
Below about 12-15 stored representative HRTF sets, the mean minimum RMS distance increases
rapidly. The lowest RMS distance is 1 dB because 1 dB is the average RMS distance between
measurements of the same individual's HRTF set. Thus, in the present analysis, when an HRTF set
randomly chosen from the 150 total HRTF sets is one of the stored HRTF sets, a value of 1 dB is
used to represent the RMS distance, not 0 dB. Accordingly, the lowest possible value for the RMS
error is 1 dB.

In one embodiment, 25 HRTF sets is the number of representative HRTF sets to be stored
in the device, for listeners to select from. This number, 25, is well below the "knee" of the plot in
Figure 22E, and is therefore a clearly adequate representative set size, thus balancing the advantages
of having a high number, for example, a closer ultimate match of the listener [...], against the
disadvantages of having a higher number, for example, higher memory cost and a longer matching
time for the listener. In one specific embodiment, the listener first chooses from among 5
representative HRTF sets, each representative set representing a set of 5 similar HRTF sets. Once
one of the 5 representative sets is selected, the user selects from among the HRTF sets
in the set of HRTF sets corresponding to the selected representative set.

In another preferred embodiment, 15 HRTF sets is the number of representative HRTF sets
to be stored in the device for listeners to select from. This number is approximately at the "knee"
of the plot in Figure 22E. Having discovered from the aforedescribed statistical analysis of our [...]
ordered database that 15 representative HRTF sets is sufficient to allow the vast majority of the
population to select an HRTF set that will allow proper audio spatialization, the 15 representative
HRTFs may be selected as follows: the entire database is ordered such that the distance metric
(Euclidean distance, RMS distance, etc.) between every HRTF and every other HRTF
is known. Thus, in a first step, every HRTF set that is a distance x, e.g., 2, dB away from a given
HRTF set in the database is identified. This identification is made for each HRTF set in the
database, and a listing is made of each HRTF set and all of the HRTF sets within x, e.g., 2, dB of
it, from the most popular to the least popular HRTF set. The most popular HRTF set is that set in
the database that has the most HRTF sets within x, e.g., 2, dB of it. In a second step, the process of
selecting 15 representative sets proceeds by first selecting the most popular HRTF set as a
representative HRTF set, and then eliminating every HRTF set that was within x, e.g., 2, dB of the
most popular HRTF set from further selection in the database. The next most popular HRTF set,
which was not eliminated upon the selection of the most popular HRTF set, is selected as the
second representative HRTF set, and every remaining HRTF set in the database within x, e.g., 2, dB
of this HRTF set is accordingly eliminated. This process is repeated, moving down the list of
popularity of HRTF sets that remain in the database. Once 15 representative HRTF sets have been
selected, the process may be terminated. Naturally, it will be recognized that fewer or more
representative HRTF sets may be selected and that a stringency, i.e., x, of greater than about 1 dB
to about 4 dB may be imposed around each of the most popular HRTF sets to select 15-25
representative HRTF sets from the entire database of measured HRTF sets. From our statistical
analysis, we have found that 15-25 representative HRTF sets is preferred for the considerations
provided above.
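The two-step "popularity" selection described above can be sketched as a greedy pass over a
precomputed distance matrix. The function name, the tie-breaking rule (lower index first) and
the use of "less than or equal to x" for "within x dB" are assumptions made for the example.

```python
def select_representatives(dist, x_db=2.0, max_sets=15):
    """Greedy popularity-based selection of representative HRTF sets.

    dist     : symmetric matrix of distances (dB) between every pair of HRTF sets
    x_db     : stringency x; sets within this distance of a chosen set are eliminated
    max_sets : stop once this many representatives have been selected
    """
    n = len(dist)
    # First step: for each HRTF set, list every other set within x dB of it.
    neighbors = {
        i: {j for j in range(n) if j != i and dist[i][j] <= x_db} for i in range(n)
    }
    # Rank from most popular to least popular (ties broken by index, an assumption).
    ranking = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))

    # Second step: walk down the ranking, skipping anything already eliminated.
    chosen, eliminated = [], set()
    for i in ranking:
        if len(chosen) >= max_sets:
            break
        if i in eliminated:
            continue
        chosen.append(i)
        eliminated.add(i)
        eliminated |= neighbors[i]
    return chosen
```

Raising `x_db` toward 4 dB eliminates more neighbors per pick and so yields fewer
representatives; lowering it toward 1 dB yields more.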

Once a number of HRTF representative sets have been selected, the user selects the HRTF
set that he/she will use in listening to program material by any of several different procedures. One
procedure is to present, via headphones, sounds filtered by a variety of HRTFs to convey the
impression of phantom sounds rotating about the listener's head. The programmed sounds are in fact
all chosen from elevations on the horizon. What is generally true of HRTFs is that the variation in
the filtered spectrum decreases as elevation increases. That is, the HRTF is generally flatter as the
elevation of the sound increases. It is also true that a listener using an HRTF that is very dissimilar
to his/her own will tend to hear the phantom sound much higher in elevation than that programmed.
Thus, when a listener hears a sound at a lower elevation, it generally means that the listener better
appreciates the structure in those HRTFs. Consequently, if one listens to a set of different HRTFs
programmed to produce the circle of phantom sounds on [...]
10, the HRTF set producing the lowest apparent elevation will provide the best means to localize
sound in the correct spatial location.

Summarizing the foregoing description, the present invention uses HRTF clustering as
illustrated in Figure 6B. As discussed above, the present invention collects and stores HRTFs from
numerous individuals in the HRTF database 63. These HRTFs are pre-processed by the HRTF
ordering processor 64 which includes an HRTF pre-processor 71, an HRTF analyzer 72 and an
HRTF clustering processor 73. The HRTF pre-processor 71 processes HRTFs so that they more
closely match the way in which humans perceive sound, as described above. The
smoothed HRTFs are statistically analyzed, each one to every other one, to determine similarities and
differences between them by HRTF analyzer 72. Based on the similarities and differences, the
HRTFs are subjected to a cluster analysis, as is known in the art, and as described above may be
"pruned" to arrive at a representative set of HRTFs, by HRTF clustering processor 73, resulting in
a hierarchical grouping of HRTFs. The HRTFs are then stored in an ordered manner in the ROM
65 for use by a listener. From these ordered HRTFs, the listener selects the set that provides the best
match via the HRTF matching processor 59. From the set of HRTFs that best match the listener,
the HRTFs appropriate for the location of each phantom speaker are input to their respective logical
HRTF processing circuits 10 to 14 of Figure 4.

Having provided a general description of the subject invention, (see Figure 4 above), a
specific embodiment thereof is described in greater detail with reference to Figures 23 through 28
hereof.

Referring to Figure 23A, after measuring HRTF sets from a sufficiently large number of
individuals, 150 individuals in this example, and performing clustering analysis to select the most
representative group of HRTF sets, 15 HRTF sets in this example, the listener is matched to or
selects a best-match HRTF set from the 15 most representative HRTF sets. Initially, the HRTF sets
of the most representative group of HRTF sets, including the user selected best-match set of HRTFs
are stored in an external EEPROM 704 to be accessed during the matching process.

Once the most representative group of HRTF sets is stored in the external EEPROM 704,
an input left 601 and right 602 audio signal, typically from a CD player, VCR, laser disk player, or
like source of audio signal are inputted to a circuit 600 for processing of the signals to achieve
accurate spatialization of the sound transmitted to the user of the headphones.

The circuit 600 may be custom burned into read only memory on a silicon or like chip, or
an off-the-shelf, commercially available chip, such as a Motorola DSP 56007 chip, may be
programmed by downloading the appropriate connectors to an electrically erasable programmable
read only memory (EEPROM) 710 which reconfigures the DSP 56007 chip each time the chip
"wakes up." Referring to Figure 23B, within the circuit 600, the signals are first routed to a Dolby
Prologic® or like decoder 603, a well defined Dolby Laboratories standard. The
Dolby Prologic® decoder 603 provides four output channels, left 604, right 605, center 606, and
surround 607, intended for loudspeakers located to the front left 608, front right 609, front center
610, and rear center 611 of the listener, see Figure 23C, respectively. Before processing the several
output channels, such as the four Dolby Prologic® channels, by filtering with HRTFs, preferably the
center signal 606 is preprocessed within an early reflection 612 processing circuit, to
simulate the early reflections that sound waves would encounter in a non-anechoic environment. The
output signal of the early reflection processing circuit, the left early reflection 613 and the right early
reflection 614 signals, are preferably added 615, 616 to the left channel signal 604 and the right
channel signal 605, respectively, yielding early reflection processed left 627 and right channel 628
signals.

Referring to Figure 24, one embodiment of this early reflection preprocessing, which is
intended to provide a sense of direction and spatial cue, comprises delay tap lines 618, 619 with
variable length filter delays 620, 621 and variable magnitude gains 622, 623 for the left and right
early reflections, respectively. The length of the delays 620, 621 and the magnitude of the gains 622,
623 can be adjusted, according to the simulated early reflections to be imposed on the signals, by,
for example, ambiance 696, theater 624, hall 625, or club 626 controls. Means for achieving
early reflection processing are known in the art (see U.S. Patent No. 5,371,799, incorporated herein
by reference for this purpose).
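
As a rough illustration of the delay tap arrangement described above, the following sketch delays
and attenuates the center signal once for each side and adds the results to the left and right
channels. It is a sketch under stated assumptions: a single tap per side, delays expressed in
samples, and placeholder delay and gain values. The function names are hypothetical.

```python
def early_reflection(center, delay_samples, gain):
    """Return the center signal delayed by delay_samples and scaled by gain."""
    delayed = [0.0] * delay_samples + list(center)
    # Truncate so the reflection has the same length as the input.
    return [gain * s for s in delayed[:len(center)]]


def add_early_reflections(left, right, center,
                          left_tap=(3, 0.5), right_tap=(5, 0.4)):
    """Add one simulated early reflection of the center signal to each side.

    Each tap is (delay in samples, gain); the defaults are placeholders only.
    """
    refl_left = early_reflection(center, *left_tap)
    refl_right = early_reflection(center, *right_tap)
    out_left = [a + b for a, b in zip(left, refl_left)]
    out_right = [a + b for a, b in zip(right, refl_right)]
    return out_left, out_right
```

Different (delay, gain) pairs could then stand in for the ambiance, theater, hall, or club
settings.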

Referring again to Figure 23B, next, within the circuit 600, the multiple channels of the
signal 627, 628, 606, 607 are processed 663 to create the sensation of phantom loudspeakers by
filtering each channel of the signal with a pair of HRTFs, from the best-match HRTF set,
corresponding to the intended location for that channel. As noted above, before the HRTF filtering
can occur, the user is matched to a best-match HRTF set. The user is preferably matched to a
best-match HRTF set, from among the most representative group of HRTF sets of the total database of
HRTF sets measured so that when used to process an audio signal the user perceives the
corresponding sounds to be localized in the proper spatial positions.

Referring to Figures 28A and 23A, one example of how this matching is accomplished is
shown in detail. The HRTF matching process begins by the user pushing an HRTF matching
control button (Ears control) 629, thus entering the HRTF matching mode. This places the user in
match mode 1 630. In match mode 1 630, the user may select from one of five clusters of HRTF sets
(sets 1-5) in the test bank. Representative HRTFs from each of the five clusters are copied from the
external EEPROM 704, which stores the most representative HRTF sets, into the internal RAM 631,
see Figure 23A, of circuit 600, for testing. The testing is accomplished by presenting the user, upon
the user pushing a noise control 703 button, with sound signals produced by a white noise process
632, Figure 28B, with a linearly decaying envelope 633. The user is first presented with a sound
processed by an HRTF 640 corresponding to a first predetermined virtual location, e.g., the front left
speaker 634, see Figure 28C, and then the user is presented with a sound processed by an HRTF 641
corresponding to a second predetermined virtual location, e.g., the rear left speaker 635, for each of
the representative HRTF sets of the five clusters copied to the RAM 631. The user sequentially
listens to each representative set by using the HRTF matching control [...]
the representative HRTF sets 1-5, and ultimately selects [...]
using a representative HRTF set from one of the [...]
clearly arriving first from the horizon to the user's front left and then arriving from the horizon to the
user's rear left. In this embodiment, the user selects the clearest sound signal by pressing the OK
button 637. The selected sound signal corresponds to the representative HRTF set of one of
the clusters of HRTF sets (1-5) which contains the first approximation of the user's best-match
HRTF set.

The next step is for the HRTF sets (sets 2.1-2.5 in Figure 28A) from the cluster
corresponding to the selected sound signal to be copied 1,000 from the external EEPROM 704 into
the internal RAM 631 for further selection by the user. Once again, the user is presented with sound
signals produced by a white noise process 632 with a linearly decaying envelope 633 processed first
by the HRTF 640 corresponding to the front left speaker 634 and then processed by the HRTF 641
corresponding to the rear left speaker 635, for each of the five HRTF sets 2.1-2.5 within the cluster
corresponding to the previously selected representative set (set 2 in Figure 28A). The user then
selects the sound signal, among those processed with each of the HRTF sets (sets 2.1-2.5 in Figure
28A) of the selected cluster (2), which the user perceives as most clearly arriving first from the
horizon to the user's front left and then from the horizon to the user's rear left. Again, in this
embodiment, the user selects this sound signal by pressing the OK button 637. Upon pressing the
OK button 637, the user has selected the user's best-match HRTF set, for example set 2.2 in Figure
28A, and the user leaves match mode.

In one embodiment, the majority of program material produced by a Dolby Prologic®
decoder is contained in the front speaker location (location 610 of Figure 23C). Thus, the device can
enable the matching process by producing a transient click-like stimulus, e.g., a white noise process
632, filtered by an HRTF appropriate for the frontal position. Fifteen such HRTFs are used, each
appropriate for the set of HRTFs associated with the 15 representative individuals chosen from the
entire population of 150 HRTFs. The user selects that HRTF which produces the clearest perception
of a phantom sound source located directly in front of the listener. This can enable the matching
process to provide a match based on the needs of the application. It should be appreciated that other
tests may be more appropriate in other applications, but this simple test is adequate for the current
application. For example, if the application requires spatialization of sounds to the sides, HRTFs
corresponding to the sides can be used in the matching process.

In one embodiment of this invention, a seat control button 643 is provided which allows the
user to select where the user will "sit" in the virtual room with respect to the virtual speakers. For
example, the user can select the front-of-the-room 644 seat position, in which case the sound which
is to appear from the left 634 and right 645 front phantom speakers will be generated by an HRTF
set (2.2.4 in Figure 28A) measured from an appropriate azimuth angle, i.e., 40 degrees azimuth left
or right respectively. In addition, for the front-of-the-room seat position 644, the front left 634, front
center 646, and front right 645 virtual speakers will be louder than the rear virtual speakers. In
contrast, if a rear-of-the-room seat position 647 is chosen, the front left 634 and right 645 virtual
speakers will be generated by an HRTF set (2.2.1 in Figure 28A) measured from a smaller azimuth
angle, i.e., 10 degrees azimuth left or right respectively. Additionally, for the rear-of-the-room seat
position 647, the front left 634, front center 646, and front right 645 virtual speakers will be softer
than the rear left (surround left) 635 and rear right (surround right) 648 speakers.

Once the user has selected a seat position by pushing a seat control button 643, 10 HRTFs
651-660, corresponding to the selected seat position and the best-match HRTF set, are copied from
the external EEPROM 704 to the internal RAM 631 for use as digital filters. The 10 HRTFs
correspond to the front left, front center, front right, rear left (surround left), and rear right (surround
right) virtual speaker locations, with a left and right HRTF for each position 651, 652, 653, 654,
655, 656, 657, 658, 659, 660. These 10 HRTF sets (651 through 660), from the best-match HRTF
set (2.2), provide the user with a best-match to the user's own head and pinnae filtering
characteristics and simulate the user's selected seat position. Note that for each of the 4 seat
positions 644, 661, 662, 647, 10 different HRTFs are copied to the RAM 631.

Referring to Figure 25, once the 10 HRTFs (651 through 660) are in the RAM 631
and available for filtering of the signal, the four standard Dolby Prologic® outputs after early
reflection preprocessing, 627, 628, 606, 607, are fed to the HRTF processing circuit. In one
embodiment of the present invention, a fifth channel (second surround channel) 664 may be
generated by optionally inverting 665 the single Dolby Prologic® surround channel 607. This
inversion 665 aids in decorrelating the two surround channels. These two surround channels 607,
664 then become rear left (surround left) 607 and rear right (surround right) 664 channels.
Accordingly, the surround right channel 664 is identical to the surround left 607 channel, although
possibly inverted. Each of the five channels (left front 627, center front 606, right front 628, left
rear 607, and right rear 664) is then split into a right and left channel for filtering by the
corresponding HRTFs (651-660) stored in the RAM 631.

Referring to Figure 23A, to prevent loss of HRTFs and other operating mode parameters
selected by the user at power-down and power-up, an EEPROM 710 stores all current parameters
of the system including current HRTFs, and its stored data is not disturbed by power-down and
power-up events. This EEPROM can save, after selection by user, multiple operating mode parameters
which can be pulled up by a user by, for example, pushing a button.

The HRTF filtering of the 5 left and 5 right channels is accomplished by convolving (or
mixing) each channel with the HRTF, from the best-match HRTF set, corresponding to the given
location and to the given ear. The convolution of these 10 signals with the corresponding HRTFs
produces signals which produce sound corresponding to virtual or phantom speakers at locations
corresponding to the locations from which the HRTFs were measured. Once the 10 convolutions are
completed, the 5 left signals are summed 666 to generate a summed left signal 668, and the 5 right
signals are summed 667 to generate a summed right signal 669. These left 668 and right 669
summed signals can be sent directly to a set of headphones for virtual speaker [...]. However,
additional processing of the summed left 668 and right 669 signals [...]
by the user may be performed. This further processing eliminates the impression of being in an
anechoic chamber with the five speakers generating the sounds. Sound in an anechoic chamber does
not have the same "fullness" of sound as if the user were in an echoic chamber.
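
The ten convolutions and the two summations described above may be sketched as follows. This is
an illustrative sketch only. It assumes each HRTF is available as a short time-domain impulse
response per ear, uses direct convolution for clarity rather than speed, and uses hypothetical names
(convolve, render_binaural, invert) and channel labels that do not appear in this description.

```python
def convolve(signal, impulse_response):
    """Full-length direct convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def mix_into(accumulator, samples):
    """Add samples into accumulator in place, growing it if needed."""
    if len(samples) > len(accumulator):
        accumulator.extend([0.0] * (len(samples) - len(accumulator)))
    for i, v in enumerate(samples):
        accumulator[i] += v


def invert(signal):
    """Polarity inversion, e.g. to derive a second surround channel."""
    return [-s for s in signal]


def render_binaural(channels, hrirs):
    """Filter each channel with its left-ear and right-ear response and sum.

    channels: {name: samples}
    hrirs:    {name: (left_ear_response, right_ear_response)}
    Returns (summed_left, summed_right).
    """
    summed_left, summed_right = [], []
    for name, signal in channels.items():
        left_ir, right_ir = hrirs[name]
        mix_into(summed_left, convolve(signal, left_ir))
        mix_into(summed_right, convolve(signal, right_ir))
    return summed_left, summed_right
```

For five channels, hrirs would hold ten responses, one per virtual speaker location and ear.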

Referring to Figure 23B, to enhance the "fullness" of the sound experienced by the user,
bass boost 670 and reverberation 671 processing is preferably performed on the signals before
presentation to the user over headphones. These are well known processes in the art. In particular,
both the left 668 and right 669 summed output from the HRTF processing are fed to a bass
boost processing block 670. Referring to Figure 27, this circuit 670 comprises, for example, a 100
Hz lowpass filter 672, 673 for each signal, left 668 and right 669, to produce signals 681 and 682
followed by an amplification 674, 675 of gain G_B for each signal, left and right. The gain G_B can
be adjusted, per the user's preference, up or down to adjust the amount of bass boost
by using the bass control button 680. The left 676 and right 677 outputs of the respective amplifiers
are then added to the respective left 668 or right 669 input signal to produce a left bass boosted
output 678 and a right bass boosted output 679 signal. The left bass boosted output 678 and right
bass boosted output 679 signals are essentially the original signal 668, 669 with an added component
comprising G_B times the respective output 681, 682 of the signal through a 100 Hz lowpass filter
672, 673, thus boosting the bass component of the signals.
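
The bass boost structure described above, namely the original signal plus a gain times a lowpass
filtered copy of it, may be sketched as follows. This is a minimal sketch under assumptions: the
lowpass filter is a simple one-pole design, the sample rate defaults to 44.1 kHz, and the function
names are hypothetical. The description does not specify the filter order or the sample rate.

```python
import math


def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """Simple one-pole lowpass filter (an assumed filter design)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, state = [], 0.0
    for s in x:
        state += alpha * (s - state)  # move the state toward the input
        y.append(state)
    return y


def bass_boost(x, gain, cutoff_hz=100.0, sample_rate=44100.0):
    """Return x plus gain times the lowpass filtered copy of x."""
    low = one_pole_lowpass(x, cutoff_hz, sample_rate)
    return [s + gain * l for s, l in zip(x, low)]
```

A gain of zero leaves the signal unchanged, while a larger gain raises the level of the content
below the cutoff.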

Referring to Figure 23B, the left bass boosted 678 and right bass boosted 679 outputs
are then added to the output of a reverberation processing circuit 671, where the inputs 604, 605,
606, 607 to the reverberation processing block are the original four standard Dolby Prologic®
outputs before any other processing. The reverberation processing 671, in conjunction with the early
reflection processing 612, provides the [...] or architectural enhancement that an anechoic
representation lacks. Referring to Figure 26, the reverberation processing circuit 671 comprises two
all-pole comb filters 683, 684, in parallel, the summed output of which 692 feeds into two all-pass
filters 685, 686 in parallel. The four standard Dolby Prologic® or like outputs are first summed
together and the sum 688 is then inputted to the first all-pole comb filter 683 and the second all-pole
comb filter 684. Each all-pole comb filter 683, 684, as shown in Figure 26, loops the input signal upon itself
over and over again with the volume reduced by some fractional amount for each successive loop.
The looping has an associated time delay, t = [k] 690, and gain, G_C 691, which can be adjusted to
suit the user, and are adjusted by the user choosing among a theater 624, hall 625, or club 626
setting, with each setting being a unique pairing of length of time delay, t = [k] 690, and magnitude
of fractional gain, G_C < 1. The summed output 692 of the two comb filters in parallel feeds two
all-pass filters 685, 686 in parallel. These all-pass filters provide a smearing effect in time to the signal
at its input without disturbing the frequency characteristics. The all-pass filters are non-linear
phase distorters and remove some of the phase information as a function of frequency. This
allows decorrelation of the left 693 and right 694 reverberation outputs, even though the input 692
to the left and right all-pass filters is the same, without disturbing the frequency profile which is
embedded in the signal from the HRTF processing. The level of the left 693 and right 694
reverberation outputs is a function of gain, G_R 695, which is controlled by the ambiance control
button 696.
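
The reverberation topology described above (inputs summed, two feedback comb filters in parallel,
their summed output feeding one all-pass filter per ear) may be sketched as follows. This is an
illustrative sketch only. The all-pass filter uses the common Schroeder form, which is an
assumption, and all delay and gain values are placeholders supplied by the caller.

```python
def comb(x, delay, gain):
    """All-pole (feedback) comb filter: y[n] = x[n] + gain * y[n - delay]."""
    y = []
    for n, s in enumerate(x):
        feedback = y[n - delay] if n >= delay else 0.0
        y.append(s + gain * feedback)
    return y


def allpass(x, delay, gain):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n-d] + g*y[n-d] (assumed form)."""
    y = []
    for n, s in enumerate(x):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y.append(-gain * s + x_delayed + gain * y_delayed)
    return y


def reverb(channels, comb_params, allpass_params, out_gain):
    """Sum the channels, run parallel combs, then one all-pass per ear.

    comb_params:    [(delay, gain), (delay, gain)] with gain < 1
    allpass_params: ((delay, gain) for left, (delay, gain) for right)
    """
    length = len(channels[0])
    mono = [sum(ch[i] for ch in channels) for i in range(length)]
    comb_outputs = [comb(mono, d, g) for d, g in comb_params]
    summed = [sum(values) for values in zip(*comb_outputs)]
    (dl, gl), (dr, gr) = allpass_params
    left = [out_gain * v for v in allpass(summed, dl, gl)]
    right = [out_gain * v for v in allpass(summed, dr, gr)]
    return left, right
```

Using different all-pass delays for the two ears is what makes the left and right outputs differ
even though both are fed the same comb filter sum.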

Referring to Figure 23B, the left 693 and right 694 reverberation outputs are summed 697,
698 with the left 678 and right 679 bass boost outputs, respectively. These summed left 701 and
right 702 signals are the left audio out 701 and right audio out 702 signals respectively. The left
audio out 701 and right audio out 702 can be sent directly to a set of headphones to provide the
listener with the sensation that the audio is originating from virtual speakers positioned according
to the seat control selection made by the user. In one embodiment, the headphones are connected via
wire to outputs 701 and 702. In another embodiment, 701 and 702 are signals sent via wireless
connection to a set of headphones (see Examples 2, 3, and 4).

Based on the foregoing disclosure, those skilled in the art will appreciate that the method
of selecting the best match set of HRTFs from a sufficiently large database of measured HRTFs may
be varied considerably, without departing from the principles of this invention. Accordingly, with
reference to Figure 29A, by analogy to Figure 28A, with primed reference numerals in Figures 29A
and 29B relating to like elements in Figures 28A and 28B, it will be appreciated that a representative
set of 15 HRTFs (sets 1-15) may be stored in the test bank. The 15 representative HRTFs used are
predicted to accommodate roughly 95% of the population, with respect to variations in the spectral
properties of their impulse responses. Again, by analogy to Figure 28A and the foregoing
description, the HRTFs are copied, one at a time, from [...]
of the DSP chip for testing. The user may test these HRTFs by asserting a test signal, see Figure
29B, which will be comprehended by analogy to Figure 28B. A white noise process with a linearly
decaying envelope is played from the Center (C) speaker (see Figure 28C). The user chooses the
HRTF set that best fits the following criteria: (a) the sound source is localized directly in front of the
user, and (b) the sound source is localized at the horizon (i.e. on a horizontal plane defined by the
user's pinnae). Once the user has identified a set of HRTFs that satisfies these criteria (i.e. has
selected a best match HRTF set), the user exits match mode. The seating position can then be
adjusted, as described above with reference to Figure 28A, by selecting the 10 HRTFs used by the
HRTF processor to localize the virtual sound sources. In this scenario, the user is spared an
intermediate step of HRTF matching used in the system shown in Figure 28A.

From the foregoing disclosure, those skilled in the art will also recognize that in an
alternative embodiment, rather than matching a user to a representative set of HRTFs wherein the
HRTFs used to process an audio signal, for each spatial position, are measured from the same
individual, a user can instead be matched to separate representative sets of HRTFs for each spatial
position. The user would perform a matching step for each spatial location, wherein a subset of each
representative set, selected for the desired spatial position, would be used to process the audio
signals. We shall refer to this set herein as a Multi-Position Head-Related Transfer Function or
MPHRTF.

In selecting the MPHRTFs, the listener would experience a sound source at each location.
The sound source may change for each location depending on the objective criterion at that location.
For example, the sound source may be speech for a location in [...]
to be presented. Another may be filtered white noise for those locations that will present ambient
noise.

In selecting these HRTFs for each location, a listener would be allowed to choose across
multiple sets of HRTFs, where a set of HRTFs is defined to be those recorded from a single subject.
This allows the listener to custom develop a "user's set of HRTFs" that best describe his/her
localization and perception characteristics at each location to be presented. Furthermore, an
interpolation algorithm could generate intermediate locations for the user's set of HRTFs as a
mixture of the selected HRTF sets.

Other variations and modifications of these selection schemes will be obvious to those
skilled in the art based on this disclosure.

Example 1

In a specific embodiment, the statistical analysis of HRTFs performed by the HRTF
analyzer 72, shown in Figure 6B, is performed through computation of eigenvectors and eigenvalues.
Such computations are known, for example, using the MATLAB® software program by The
MathWorks, Inc. An exemplary embodiment compares HRTFs by computing eigenvectors and
eigenvalues for the set of 2S HRTFs at L * N levels. Each subject-ear HRTF set may be described
by one or more eigenvalues. Only those eigenvalues computed from eigenvectors that contribute to
a large portion of the shared variance are used to describe a set of subject-ear HRTFs. Each
subject-ear HRTF may be described by, for example, a set of 10 eigenvalues.

In this embodiment, the cluster analysis procedure performed by the HRTF clustering
processor 73, shown in Figure 6B, is performed using a hierarchical agglomerative cluster technique,
for example the S-Plus® program, provided by MathSoft, Inc., based on the distance between each
set of HRTFs in multi-dimensional space. Each subject-ear HRTF set is represented in
multi-dimensional space in terms of eigenvalues. Thus, if 10 eigenvalues are used, each subject-ear HRTF
would be represented at a specific location in 10-dimensional space. Distances between each
subject-ear position are used by the cluster analysis in order to organize the subject-ear sets of
HRTFs into hierarchical groups. Hierarchical agglomerative clustering in two dimensions is
illustrated in Figure 14. Figure 15 depicts the same clustering procedure using a binary tree
structure.
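
For illustration, hierarchical agglomerative clustering of points in a multi-dimensional space may
be sketched as follows. This is a naive sketch and not the procedure of any particular statistics
package. It assumes each subject-ear HRTF set has already been reduced to a coordinate tuple, and
it assumes average linkage with Euclidean distance, neither of which is specified in this
description. The result is a binary tree expressed as nested tuples of point indices.

```python
import math


def agglomerate(points):
    """Naive average-linkage agglomerative clustering.

    points: list of equal-length coordinate tuples (one per HRTF set).
    Returns a nested-tuple binary tree whose leaves are point indices.
    """
    # Each cluster is (tree, member indices); start with one per point.
    clusters = [(i, [i]) for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two member lists.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        _, p, q = min(
            (linkage(clusters[p][1], clusters[q][1]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        merged = ((clusters[p][0], clusters[q][0]),
                  clusters[p][1] + clusters[q][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return clusters[0][0]
```

With 10 coordinates per subject-ear set, each tuple in points would simply have length 10.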

This embodiment stores sets of HRTFs in an ordered fashion in the ROM 65 based on the
result of the cluster analysis. According to the clustering approach to HRTF matching, the present
invention employs an HRTF matching processor 59 in order to allow the user to select the set of
HRTFs that best match the user. In an exemplary embodiment, an HRTF binary tree structure is
used to match an individual listener to the best set of HRTFs. As illustrated in Figure 15, at the
highest level 48, the sets of HRTFs stored in the ROM 65 comprise [...]. At the next
highest level 49, 50, the sets of HRTFs are grouped based on similarity into two sub-clusters. The
listener is presented with sounds filtered using representative sets of HRTFs from each of two
sub-clusters 49, 50. For each set of HRTFs, the listener hears sounds filtered using specific HRTFs
associated with a constant low elevation and varying azimuths surrounding the listener. The listener
indicates which set of HRTFs appears to be originating at the lowest elevation. This becomes the
current "best match set of HRTFs." The cluster in which this set of HRTFs is located becomes the
current "best match cluster."

The "best match cluster" in turn includes two sub-clusters, 51, 52. The listener is again
presented with a representative pair of sets of HRTFs from each sub-cluster. Once again, the set of
HRTFs that is perceived to be of the lowest elevation is selected as the current "best match set of
HRTFs" and the cluster in which it is found becomes the current "best match cluster." The process
continues in this fashion with each successive cluster containing fewer and fewer sets of HRTFs.
Eventually the process results in one of two conditions: (1) two groups containing sets of HRTFs
so similar that there are no statistically significant differences within each group; or (2) two groups
containing only one set of HRTFs. The representative set of HRTFs selected at this level becomes
the listener's final "best match set of HRTFs." From this set of HRTFs, specific HRTFs are selected
as a function of the desired phantom loudspeaker location associated with each of the multiple
channels. These HRTFs are routed to multiple HRTF processors for convolution [...].
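
The descent through the binary tree described above may be sketched as follows. This is a sketch
under assumptions: the tree is given as nested tuples whose leaves identify HRTF sets, the
representative of a sub-cluster is taken to be its first leaf (the description does not say how the
representative is chosen), and the listener's judgment is modeled by a hypothetical callback
prefers_first that returns True when the first set is heard at the lower apparent elevation. The
stopping condition based on statistical similarity is omitted.

```python
def leaves(tree):
    """Return the leaf identifiers of a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def match_listener(tree, prefers_first,
                   representative=lambda subtree: leaves(subtree)[0]):
    """Walk down the tree, keeping the sub-cluster the listener prefers.

    prefers_first(a, b) is an assumed callback standing in for the listening
    test between representative sets a and b.
    """
    node = tree
    while isinstance(node, tuple):
        a = representative(node[0])
        b = representative(node[1])
        node = node[0] if prefers_first(a, b) else node[1]
    return node  # the final best match set
```

Because only one representative per sub-cluster is auditioned at each level, the number of
listening comparisons grows with the depth of the tree rather than with the number of stored sets.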

Referring to Figure 7, left 701 and right 702 audio out signals of Figure 23A (or 30 and 31
of Figure 4), can be inputs, for example 754, of a typical digital signal transmission system known
in the art, the output of which, for example 762, can be inputted to a set of headphones.

Left 701 and right 702 audio out signals (or 30 and 31 of Figure 4) can be outputted in
digital or analog format. If outputted in analog format, each signal can be converted to digital format
755. In a preferred embodiment of this invention, after conversion to digital format, the left and right
audio signals are interlaced in time to create a single digital signal 755 which carries both the left and
right channel information. For example, the single interlaced digital signal 755 can have a first
digital word, e.g., 16 bits, that is a right audio channel word, a second digital word that is a left audio
channel word and thereafter alternating between right and left (see Figure 9G). This single digital
signal 755 carrying both the left and right audio channel information can then be inputted, for
example 755 of Figure 7, to a typical digital signal transmission system.
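
The time interlacing of left and right channel words into a single stream may be sketched as
follows. This is a minimal sketch. The words are modeled as plain integers, and which channel
leads is left as a parameter, since either order works provided the transmitter and receiver
agree. The function names are hypothetical.

```python
def interleave(left, right, right_first=True):
    """Interlace two equal-length lists of channel words into one stream."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of words")
    stream = []
    for l, r in zip(left, right):
        stream.extend((r, l) if right_first else (l, r))
    return stream


def deinterleave(stream, right_first=True):
    """Split an interlaced stream back into (left, right) word lists."""
    first, second = stream[0::2], stream[1::2]
    return (second, first) if right_first else (first, second)
```

The receiving side recovers the two channels by taking alternate words from the stream.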

A standard digital signal transmission system, as shown in Figure 7, typically comprises a
transmitting station 751, a connecting medium called a channel 752, and a receiving station 753.
The transmitting station 751 can receive an analog signal 754 and convert it to a digital signal 755
or can receive a digital signal 755 directly. Conversion of an analog to a digital signal, for example
using an analog-to-digital (A/D) converter 756, requires the analog signal to be sampled and
quantized to the nearest of a number of discrete signal levels. The discrete signal level of the
quantized signal is sent to a source encoder 757 where each discrete signal level is converted to a
digital representation thereof, typically binary. This representation can consist of digital words, for
example 16-bit digital words, wherein each digital word represents the value of a discrete signal
level. These digital words can be transmitted sequentially as a serial binary digital bit stream. The
binary digital representation is in a particular waveform format, e.g., unipolar or Manchester, and
is sent to a modulator 758, which modulates the signal for transmission over the channel 752. For
instance, the modulator 758 can be an RF modulator, for which the corresponding channel would be
air. Alternatively, the channel may be a wire or like transmission means. The receiving station 753
is essentially the inverse of the transmitting station and comprises a demodulator 759, a source
decoder 760, and an optional digital-to-analog converter. The output from the receiving station
can accordingly be either an analog output 762 or a digital output 763.
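
The quantization and source encoding steps described above may be sketched as follows. This is an
illustrative sketch only. It assumes samples normalized to the range -1.0 to 1.0, 16-bit two's
complement words, and most significant bit first ordering, none of which is mandated by this
description. The function names are hypothetical.

```python
def quantize_16bit(sample):
    """Map a sample in [-1.0, 1.0] to the nearest signed 16-bit level."""
    level = round(sample * 32767)
    return max(-32768, min(32767, level))  # clip out-of-range input


def to_bits(word, width=16):
    """Two's complement bits of word, most significant bit first."""
    value = word & ((1 << width) - 1)
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def serialize(samples):
    """Quantize each sample and emit one serial bit stream."""
    bits = []
    for s in samples:
        bits.extend(to_bits(quantize_16bit(s)))
    return bits
```

Note that a silent input serializes to an unbroken run of zero bits in this representation.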

Example 3

Important parameters and design considerations for a digital signal transmission system are
bandwidth of the channel, costs of the transmitting and receiving stations, power consumption of the
transmitting and receiving stations, and the particular binary waveform [...].
Bandwidth is important because it limits the amount of information that can be transmitted.
The selection of the binary waveform is important because the selection can affect [...]
the costs, complexity, and power consumption of the transmitting and receiving stations. This
example provides a method for signal transmission that avoids certain problems, discussed below,
inherent in known transmission systems for digital signals which enhances the fidelity of the HRTF
processed signal of this invention as it is sent to a listener.

Where a receiver, for example, within the receiving station of Example 2, has no clock which
is, a priori, synchronized to an incoming digital bit stream, the digital bit stream is called an
asynchronous signal. When an asynchronous binary format digital bit stream is received, the receiver
must, therefore, lock-on to the bit rate in order to generate a clock signal, tied to the bit rate, to
enable the receiver to decode the signal. Locking-on to the bit rate can be accomplished by known
methods, for example, using a phase-locked loop (PLL). However, there can be difficulties in
locking on to the bit rate when receiving digital audio signals represented in binary format, (e.g.,
two's complement), which are often dominated by repeated strings of contiguous zeroes and/or ones.
For example, these strings of contiguous zeroes and/or ones can be encountered with audio signals
during moments of silence, or idle patterns. These strings of contiguous zeroes and ones can lead
to drifting of the output frequency of the PLL due to an imbalance in the charging and discharging
events within the PLL. When the output frequency of the PLL drifts, the PLL can lose its lock,
resulting in decoding errors, and thus degradation in the performance of the entire transmission
system. In contrast, a binary format digital signal without repeated strings of contiguous zeroes
and/or ones would give the PLL a balance of charging and discharging events, allowing the PLL to
track the digital signal's frequency more accurately.

Existing solutions for eliminating the drifting of the PLL's lock-in frequency due to repeated
strings of contiguous zeroes and/or ones have required additional bandwidth or complicated,
expensive hardware. For example, Manchester, or bi-phase-level encoding, commonly used for
digital audio signals, eliminates the drifting of the PLL. A Manchester encoded waveform transmits
the symbol 1 as a positive pulse for half of the symbol interval, followed by a negative pulse for the
remainder of the interval; the symbol 0 is conveyed by the same two-pulse sequence but of opposite
polarity. Therefore, using Manchester encoding, even with binary format digital signals having
repeated strings of zeroes and/or ones, receiver clock timing can be extracted without drifting of the
PLL by providing a charging and discharging event for the PLL in the form of a signal transition for
each bit received. Accordingly, the Manchester encoding technique allows the PLL to easily lock on
to these regular signal transitions. Unfortunately, Manchester encoding requires about twice the
bandwidth of other encoding techniques, such as unipolar and bipolar signaling. Additionally, other
techniques which have not required as much bandwidth as Manchester have also been employed.
However, these techniques are more complicated and therefore more costly to encode and decode.
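
The Manchester scheme described above can be illustrated with a minimal sketch. The function
names are hypothetical and the levels +1 and -1 stand in for the positive and negative pulses; the
sketch only shows why an idle pattern still yields a transition in every bit, at the cost of two
half-symbols per bit.

```python
def manchester_encode(bits):
    """Map each bit to two half-symbol levels: 1 -> (+1, -1), 0 -> (-1, +1)."""
    levels = []
    for b in bits:
        levels.extend((+1, -1) if b else (-1, +1))
    return levels


def transitions(levels):
    """Count level changes between consecutive half-symbols."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


idle = [0] * 8                           # idle pattern: eight contiguous zeroes
encoded = manchester_encode(idle)        # sixteen half-symbols for eight bits
plain = [+1 if b else -1 for b in idle]  # same bits sent one level per bit
```

The encoded idle pattern has twice as many symbols as the plain one, which corresponds to the
roughly doubled bandwidth noted above.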

This example provides a novel solution to these problems and provides a method of efficient
carrier stabilization and bit clock embedding. In a specific embodiment, the subject invention
includes a novel encoding, transmission, and decoding technique for binary format digital signals.
This is particularly advantageous when applied to signals with frequent idle patterns (e.g., digital
audio). Advantages of the subject technique include efficient carrier stabilization and bit clock
embedding. In addition, this technology provides a low-cost, low power-consumption
transmitter/receiver combination for digital signals, including, but not limited to, digital radio
frequency (RF) audio signals processed according to this invention to spatialize sound over
headphones.

The subject encoding technique can operate on input binary encoded digital signals, typically
encoded in two's complement. The subject technique involves (a) removing the DC component of
the input binary encoded digital signal, if present, and, if not already present, adding a small amount
of noise to the input binary encoded digital signal, to ensure that each bit location undergoes
transitions between the zero and one states, even during idle patterns; (b) inverting, or toggling,
every other bit of the binary encoded signal to provide sufficient transitions between adjacent bits
to enable the receiver to lock on to the bit rate and to prevent drifting of the receiver's PLL when
long strings of contiguous zeroes and/or ones are present in the input binary encoded digital signal;
and (c) encoding a locking bit on the digital signal, for example one locking bit at the start of each
word. This locking bit enables the receiver to lock on to the word pattern of the digital signal, i.e.,
the position of the digital words within the digital bit stream. In addition to having little or no DC
component, the signal should have enough self-noise to ensure frequent transitions from positive to
negative values of the signal. Note, if a signal does not have sufficient self-noise, a noise generator
is summed with the signal to ensure frequent transitions between positive and negative values for
the signal.

The subject encoding technique operates on an input binary encoded digital signal, typically
encoded in two's complement. The first step of the subject technique is to remove the DC component
of the input binary encoded digital signal, if present. Since the DC component of the signal is
removed, this technique is applied to signals where DC coupling is not critical, as in the audio signals
of this invention. Since the human ear cannot detect DC sounds, the DC component is not important
with respect to digital audio signals. Therefore, this technique is particularly advantageous with
respect to processing digital audio signals.

With reference to Figure 8A, the left 701 and right 702 audio out signals (or 30 and 31 of
Figure 4) can be outputted in digital or analog format. If outputted in analog format, each signal can
be converted to digital format 901. In a preferred embodiment of this invention, after conversion to
digital format, the left and right audio signals are interlaced in time to create a single digital signal
901 which carries both the left and right channel information. For example, the single interlaced
digital signal 901 can have a first digital word, e.g., 16 bits, that is a right audio channel word, a
second digital word that is a left audio channel word, and thereafter alternating between right and left
(see Figure 9G). This single digital signal 901 carrying both the left and right audio channel
information can then be inputted as shown in Figure 8A.

It is preferred that the DC be removed 902 from the signal after the signal is in digital form
901, rather than from the analog signal prior to digitization. When one attempts to remove the DC
component of an analog signal before digitization, a small DC component can be introduced into
the digital signal during conversion from analog to digital. This DC component introduced into the
digital signal is inherent in known analog-to-digital converters [...] when implementing the subject
invention. For instance, during idle patterns of the signal, this residual DC component can cause bit
locations to "stick" (i.e., remain in a zero or one state) for long periods. This "sticking" can make it
possible for the receiver to mistake a "sticking" bit for a locking bit, which, as discussed in greater
detail below, is a bit which can be encoded on the digital signal and, typically, is always a zero or
always a one.

Removing the DC component 902 can be accomplished by many known techniques, for
example, by passing the signal through a high-pass digital filter. This high-pass filter can be, for
example, an infinite impulse response (IIR) high-pass digital filter. It is important, when designing
the apparatus which is to remove the DC component from the digital signal, that the apparatus does
not detrimentally affect the non-DC components of the digital signal. In a specific embodiment, a
first-order Butterworth digital high-pass filter, with [...]. In a preferred
embodiment, an adaptive filter is used to remove the DC component.

In a preferred embodiment, an adaptive filter such as that shown in Figure 8B is used to
remove the DC component 902 of the input binary encoded digital signal 901, generated by
interlacing in time the digital format representation of left 701 and right 702 audio out signals of
Figure 23A (or left 30 and right 31 earphone signals of Figure 4). For clarity we can define the left
channel words within 901 as 901l and the right channel words within 901 as 901r. The
digital signal, in a specific embodiment, can be a 16 bit word signal where left and right channel
words are interlocked in time such that the first 16 bit word represents the first right channel word
and the second 16 bit word represents the first left channel word. Accordingly, each successive 16
bit word alternates between right channel and left channel. In this case, when removing the DC
component 902, it is required to separately remove the DC from the right channel 901r and the left
channel 901l, due to the independence of the right channel and left channel signals. Therefore, the
right 901r and left 901l channels are split apart to be operated on independently for removal of the
DC component 902.

For clarity of discussion, the processing of the left channel 901l will be explained, noting
that the right channel 901r undergoes the same processing independently. Referring to Figure 8B,
the digital word of the input signal 901l is first summed 771 with a tracking constant C[k] 772,
which can initially be zero. The sum 773, which is also the output of the adaptive filter, then is
compared to zero 774, for example, by observing the sign bit of the word. If the word is less than
zero 775, the tracking constant C[k] 772 is increased by a step size Q2 776, C[k+1] = C[k] + Q2.
Alternatively, if the word is greater than zero 777, the tracking constant C[k] 772 is decreased by
a step size Q1 778, C[k+1] = C[k] - Q1. The tracking control variables, Q1 and Q2, are dependent
upon the amount of gain desired in the adaptation control circuit. This adaptive filter effectively
integrates out an average, or DC component, and continually removes it from the source signal.
When the input signal 901l or 901r has sufficient self-noise to ensure transitions between
positive and negative values even after the DC component is removed, then it is preferred that Q1
and Q2 be equal in size. In addition, referring to Figure 8A, if the input signal 901l or 901r does not
have sufficient self-noise, a noise generator 924 can be used to add in sufficient noise. In a preferred
embodiment, if the input signal 901l or 901r does not have sufficient self-noise, the adaptive filter
of Figure 8B can be used to both remove the DC component and add in sufficient noise
by having Q1 = 2Q2. In this embodiment, an input signal 901l or 901r having a DC component of
zero, with no noise, would first be increased by Q2 to a value of Q2, then would be decreased by
2Q2 to a value of -Q2, then be increased by Q2 to a value of zero, and thus repeat through these
values. This ensures that each bit location undergoes transitions between the zero and one state
even during idle patterns.

Referring to Figures 9A, 9B, and 9C, the results of a computer simulation of removing the
DC component from a gaussian noise source using an adaptive filter, as shown in Figure 8B, are
illustrated. In this simulation, a gaussian noise source with a variance of 2.5 mV and a mean of 0.5 V
is introduced to the adaptive filter. For this simulation, a value for both Q1 and Q2 of 0.488 mV is
used. Figure 9A shows the original gaussian noise source waveform. Figure 9B shows the value of
the tracking constant, C[k], and Figure 9C shows the output waveform of the adaptive filter. These
plots are over 2048 samples or about 52 msec. The output waveform clearly has the DC component
removed in the latter half of the plot.
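
The adaptive filter of Figure 8B can be sketched in a few lines. This is a minimal illustration
rather than the circuit itself: the function name is hypothetical, and treating an output of exactly
zero as "increase the tracking constant" is an assumption chosen so that the idle sequence
0, Q2, -Q2, 0 described above is reproduced.

```python
def adaptive_dc_removal(samples, q1, q2, c0=0):
    """Return (outputs, tracking) for a sign-driven tracking filter.

    y[k] = x[k] + C[k]; C is lowered by q1 when y[k] > 0 and raised by q2
    otherwise (handling of y[k] == 0 is an assumption, see the lead-in).
    """
    c = c0
    outputs, tracking = [], []
    for x in samples:
        y = x + c
        outputs.append(y)
        tracking.append(c)
        c = c - q1 if y > 0 else c + q2
    return outputs, tracking


# Idle input with Q1 = 2*Q2: the output cycles 0, +Q2, -Q2, 0, ...
idle_out, _ = adaptive_dc_removal([0] * 6, q1=2, q2=1)

# Constant offset with Q1 = Q2: the tracking constant walks toward -offset.
dc_out, dc_track = adaptive_dc_removal([100] * 300, q1=1, q2=1)
```

With equal step sizes the tracking constant moves one step per sample, so an offset of 0.5 V and
a step of 0.488 mV need roughly 0.5 / 0.000488, or about 1025 samples, to be tracked out. That is
about half of the 2048 samples plotted in the simulation.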

Referring to Figures 9D and 9E, the magnitude frequency response of the input gaussian
noise waveform and DC shifted output waveform are shown, where Figure 9D is up to 2x10^4 Hz
while Figure 9E shows an expanded view up to 1000 Hz.

Once the DC component has been removed, the next step is to toggle every other bit 903
of the signal. This toggling can be accomplished by known means, for example, by exclusive ORing
the signal with a sequence of alternating ones and zeroes, i.e., ...1010...10... The output of an
exclusive OR gate is a one if, and only if, only one of the two inputs is a one. Therefore, when an
input is exclusive ORed with a zero, the output is the same as the input. However, when an input
is exclusive ORed with a one, the output is an inversion of the input. For example, a one exclusive
ORed with a one gives an output of zero and a zero exclusive ORed with a one gives an output of
one. Referring to Figure 8A, in a specific 16 bit embodiment, every other bit of the encoded signal
is inverted by exclusive ORing 903 each word of the signal with 1010...10. It should be
noted that one could alternatively exclusive OR the signal with 010101...01 and adjust the receiver
accordingly. The purpose of this toggling, or inverting of every other bit, is to provide sufficient
transitions between adjacent bits to enable a receiver to lock on to the bit rate. In combination, the
removal of the DC component, and subsequent inverting of every other bit, ensures that there will
not be repeated strings of contiguous ones or zeroes, and that each bit location is guaranteed to
alternate, or flip flop, between the one and zero states, even during idle patterns of the signal.

To illustrate, in a specific embodiment, 24 bit signed two's complement encoding is used.
The most significant bit location is the sign bit in the two's complement binary format, where the
sign bit is zero for positive and one for negative signal values. Since the DC component of the
digital signal has been removed, the digital signal frequently transitions between positive and
negative. Therefore, the sign bit location is equally likely to be a one or a zero. Combining the
removal of the DC component with the inversion of every other bit ensures that the other
bit locations in this 24 bit illustration are also just as likely to be a one or a zero, and there are no
repeated strings of contiguous ones or zeroes remaining in the signal.

By contrast, even when the DC component is removed, if every other bit were not inverted,
the 24 bit signal would frequently have positive value words having a string of zeroes in the most
significant bits during idle patterns, such as 000000000000000000100101, with only the least
significant bits being in a different state than [...]. Likewise, there would also be many
negative value words, with a string of ones in the most significant bits such as
111111111111111110101110, again with only the least significant bits being in a different state.
If the signal, for example due to noise, were such that the signal remains positive or negative for
relatively long periods, then these most significant bits can "stick" at a particular value, zero or
one, for an equally long period. These "sticking" bits could be mistaken for a locking bit, wherein a
locking bit is a bit which can be encoded on the digital signal and, typically, is always a zero or
always a one. A locking bit can be located at a certain bit location within a word to allow a receiver
to lock on to the location of the words within the signal by locking on to the locking bit. However,
according to the subject invention, after exclusive ORing the signal with 1010...10,
000000000000000000100101 is converted to 101010101010101010001111 and
111111111111111110101110 is converted to 010101010101010100000100. Therefore, after exclusive
ORing the signal with 1010...10, it is ensured that the PLL will receive a balanced number of
charging and discharging events as well as numerous transitions at the bit rate, thus allowing the
PLL to stay locked on to the bit rate. Additionally, the noise on the signal, sufficient to ensure
transitions between positive and negative values of the signal, ensures that no bit will "stick" in a
certain state for too long even during idle bit patterns.
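
The two 24 bit conversions quoted above can be checked with a short sketch. The helper names are
hypothetical; the mask is the alternating pattern 1010...10 from the text, and the word width is
assumed to be even.

```python
def toggle_alternate_bits(word, width):
    """Exclusive OR a width-bit word with the alternating mask 1010...10."""
    mask = int("10" * (width // 2), 2)
    return word ^ mask


def to_bits(word, width):
    """Render a word as a zero-padded binary string of the given width."""
    return format(word, "0{}b".format(width))


def longest_run(bits):
    """Length of the longest run of identical characters in a bit string."""
    best = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


positive = int("000000000000000000100101", 2)
negative = int("111111111111111110101110", 2)
positive_tx = to_bits(toggle_alternate_bits(positive, 24), 24)
negative_tx = to_bits(toggle_alternate_bits(negative, 24), 24)
```

Applying the same mask a second time restores the original word, which is what allows the
receiver to undo the toggling with an identical exclusive OR.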

A "code violation" within the signal can be used to allow the receiver to determine where
each word begins. In order to provide this code violation, a locking bit can be placed at certain
locations within the signal. For example, in an audio signal, right and left channel words can be
interlocked in time, where each channel can have, for example, 16 bits as shown in Figure [...]. In
this case, the locking bit can be located in a certain position of the right channel word, for example,
in the least significant bit location. This locking bit then gives the location of the right channel word,
as well as the location of the left channel word. This locking bit can be, for example, always a zero
or always a one, which allows a receiver to lock on to the locking bit and, therefore, the word pattern
of the digital bit stream. In a specific 16 bit word embodiment, after removing the DC and exclusive
ORing with 1010...10, each, for example, right channel word is ANDed with 11...10. This
AND operation leaves the first 15 bits of the 16 bit word unchanged and necessarily encodes a zero
in the 16th bit location. This guarantees that each right word has, as a locking bit, a zero in the least
significant bit location, to allow determination of the location of each word in the digital signal at
the receiver. It is important to note that it is not necessary for each word or even every other word
to have a locking bit encoded on it. Indeed, a locking bit could be encoded on every third or fourth
word. In fact, the limit as to how far apart locking bits can be [...]
complexity of the receiver to be used.
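
The toggling and locking bit steps for one right/left pair of 16 bit words can be sketched as
follows. The function names are hypothetical, DC removal is assumed to have been done already,
and the locking bit is placed in the least significant bit of the right word as in the specific
embodiment above.

```python
MASK_TOGGLE = 0xAAAA   # 1010...10 for a 16 bit word
MASK_LOCK = 0xFFFE     # 11...10: forces the least significant bit to zero


def encode_frame(right, left):
    """Encode one right/left pair; returns the words in transmission order."""
    right_tx = ((right & 0xFFFF) ^ MASK_TOGGLE) & MASK_LOCK
    left_tx = (left & 0xFFFF) ^ MASK_TOGGLE
    return right_tx, left_tx


def decode_frame(right_tx, left_tx):
    """Undo the toggling; the right word's LSB was given up to the locking bit."""
    return right_tx ^ MASK_TOGGLE, left_tx ^ MASK_TOGGLE


right_tx, left_tx = encode_frame(0x0025, 0xFFAE)
right_rx, left_rx = decode_frame(right_tx, left_tx)
```

Because the mask 1010...10 has a zero in its last position, the locking bit passes through the
receiver's exclusive OR unchanged, so the decoded right word simply has a zero least significant
bit.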

Once processed as described above, the signal can be transmitted via a wired connection to
headphones or through the air. In a specific example, referring to Figure 8A, for wireless
transmission, the signal is inputted to a frequency shift keying (FSK) transmitter 905, such as a
RF9901 FSK transmitter chip from RF Micro Devices, which modulates the signal for transmission
from a transmitting loop antenna 906. A corresponding receiving loop antenna 907 receives the
incoming FSK modulated signal and sends the signal to a FSK receiver 908, such as a RF9902 FSK
receiver chip from RF Micro Devices, which demodulates the signal. The demodulated signal can
then be inputted to conventional two transducer headphones for listening.

The receiver should be able to lock on to the bit rate and then lock on to the locking bit in
order to decode the signal. Referring to Figure 9F, the receiver can comprise a phase lock loop 815,
which provides a master clock 804 and aligns the clocking bits with the data bits provided from, for
example, an RF demodulator. The receiver can further comprise a state machine 800, which can be
the center of the timing for the receiver, and can also perform a number of operations including:
clocking functions for the D/A converter, reclocking of the data delivered to the D/A, and control
lines for master reset. The state machine can provide a serial clock 805, SCLK, a left/right clock
806, L/RCLK, and data 803, SDATA, to a D/A converter. The state machine 800 can, for example,
be a free running eight bit counter. Where the signal is transmitted wirelessly, the state machine 800
receives the RF data 801 (RF Digital) and inverts the bits which were inverted prior to transmission
by exclusive ORing RF Digital 801 with a clocking signal Q3 802 which has a frequency one half
of the bit rate (or 1/16 of the master clock). The data stream can then be latched to produce a strong,
clean data bit stream, 803 (SDATA), to present to the D/A converter.

The locking bit is encoded on the incoming data stream, RF Digital 801, to allow the
receiver to maintain word lock. The locking bit can be, for example, always 0 (logic level low) in
the least significant bit of the digital data word. The state machine 800 looks for the locking bit
during a window of time, the locking bit window 808, to determine if lock is being maintained. If
a 0 is present, no action is taken; however, if a 1 is detected, the state machine 800 resets itself via
its reset control line 809. After resetting, the state machine 800 can, for example, [...]
data position and the process continues until lock is regained. It should be understood that the
locking bit could always be 1 and then the state machine would reset upon detecting a 0 during the
locking bit window 808.
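
The search for word lock described above can be modeled in software. This is a simplified sketch,
not the state machine of the figures: the function names, the `needed` threshold, and the rule of
moving to the very next bit after a reset are all assumptions made for illustration. The test
stream is built so that every bit location other than the locking bit alternates from frame to
frame.

```python
FRAME_BITS = 32    # one right word followed by one left word, 16 bits each
LOCK_INDEX = 15    # D0 of the right word, counted from the first bit sent


def make_stream(frames, start_offset):
    """Serialize 32 bit frames MSB first and drop the first start_offset bits."""
    bits = []
    for f in frames:
        bits.extend((f >> (FRAME_BITS - 1 - i)) & 1 for i in range(FRAME_BITS))
    return bits[start_offset:]


def find_word_lock(bits, needed=4):
    """Return the stream index accepted as the locking bit, or None.

    A 1 in the window forces a reset to the next bit position; lock is
    declared after `needed` consecutive windows containing a 0.
    """
    pos, good = 0, 0
    while pos < len(bits):
        if bits[pos] == 0:
            good += 1
            if good == needed:
                return pos - (needed - 1) * FRAME_BITS
            pos += FRAME_BITS      # look again one frame later
        else:
            good = 0
            pos += 1               # reset and try the following bit position
    return None


frame_a = ((0x1234 & 0xFFFE) << 16) | 0x5679
frame_b = (~frame_a & 0xFFFFFFFF) & ~(1 << 16)   # every bit flipped except the locking bit
frames = [frame_a, frame_b] * 30
lock_at = find_word_lock(make_stream(frames, 5))
```

Since every other bit location changes state between frames, a wrong position can survive at most
one window before a 1 appears and forces another reset.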



WO 97/25834    PCT/US97/00145

In a specific embodiment, returning to Figure 8A, the demodulated signal output from the
FSK receiver 908, called RFDIG 801, is in the same binary format as the signal which entered the
FSK transmitter 905. In order to decode the signal, it is inputted to a phase-locked loop (PLL) 815
and also inputted to an exclusive OR gate 917 to be exclusive ORed with 1010...10. The PLL 815
is able to lock on to the frequency of the bit rate due to sufficient bit transitions provided by the
exclusive ORing of the signal with 1010...10 prior to transmission, which provides a strong
frequency component at the bit rate and provides the PLL 815 a balanced number of charging and
discharging events. The output of the PLL 815 is the master clock 804, MCLK, which has a
frequency eight times the bit rate. The MCLK is inputted to a divide-by-eight state machine 912,
with the output thereof, at a frequency equal to the bit rate, fed through a feedback loop 913 to the
PLL 815 and fed to latch 916. Additionally, MCLK 804 is inputted to a state machine 800 which
generates clock signals at MCLK/2 (or Q0) 810, MCLK/4 (or Q1) 811, MCLK/8 (or Q2) 805,
MCLK/16 (or Q3) 802, MCLK/32 (or Q4) 812, MCLK/64 (or Q5) 813, MCLK/128 (or Q6) 814, and
MCLK/256 (or Q7) 806, wherein MCLK/2 means a clocking signal at the MCLK frequency divided
by 2, etc. Figure 9G shows how these clock signals align with each other and with RFDIG
801, the output of exclusive OR gate 917, XOR output 816, and the output of latch 916, SDATA
803.

Figure 9G shows two 16-bit words, right channel word D15, D14, ..., D0, and left channel
word D15, D14, ..., D0, from a digital bit stream, RFDIG 801 in Figure 8A. Note, these two 16-bit
words could be considered one 32-bit word. In this embodiment, the first D15, D14, ..., D0 can be
a right channel word and the next D15, D14, ..., D0 can be a left channel word. MCLK/8 (or Q2
805) is referred to herein as SCLK, the data clock at twice the bit rate, which can be used to
determine the state, one or zero, of each bit. To lock on to the locking bit, located at D0 of the right
channel word, an eight input NAND gate 915 with inputs NOT Q7 817, Q6 814, Q5 813, Q4 812,
Q3 802, NOT Q2 818, NOT Q1 819, and a bit value from latch 916, SDATA 803 after inversion,
922, is used. Latch 916 can delay each bit for one cycle of MCLK/4, or one-half the duration of
a bit. Therefore, the output from latch 916, SDATA 803, is delayed with respect to the output of
exclusive OR 917, by one-half the duration of a bit. This latching and delay allows the bit to be
clean and strong during the locking bit window 808. Figure 9G illustrates the alignment of SDATA
803, and the various clock signals when the state machine is in lock with the locking bit.

However, before attaining lock on to the locking bit, the bit value during the locking bit
window 808, one or zero, from latch 916 is the bit value of Dn, which is any one of D15, D14, ...,
D0, D15, D14, ..., D0 from either the left or right channel word as shown in Figure 9G. The bit
value of Dn is obtained by exclusive ORing 917 RFDIG 801 with Q3 802. Exclusive ORing 917
Q3 802 with RFDIG 801 inverts the previously inverted bits to generate a data signal, XOR output
816, which is a replica of the original binary [...]. Q3 802
is synchronized with RFDIG 801 by locking on to [...]. After the PLL 815 has locked on to
the bit rate, the locking bit is located by first resetting the state machine [...]
the two 16 bit word cycle. If the output 921 of the NAND gate 915, after inversion by inverter 92[...],
is a zero, then the selected bit is a one and therefore not the locking bit. Alternatively, the inverted
NAND gate 915 output 921 will be one only when the inverted bit 922 from SDATA 803 is a one,
corresponding to the bit from SDATA 803, the locking bit, being a zero. The inverted NAND gate
915 output 921 can only be a one if the inverted bit 922 from SDATA 803 is a one and
NOT Q7 817 is a one, Q6 814 is a one, Q5 813 is a one, Q4 812 is a one, Q3 802 is a one,
NOT Q2 818 is a one, and NOT Q1 819 is a one, based on the inputs to the NAND gate 915. As
can be seen from Figure 9F, this only occurs at the D0 bit location of the right channel word.
Therefore, if Dn (n ≠ 0) is arriving when D0 should arrive, then the inverted NAND 915 output 921
remains zero until Dn eventually becomes a zero.
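
The clock inputs of NAND gate 915 can be read as a decoder for one position in the 32 bit frame.
The sketch below is an idealized model, not the timing of Figure 9G: it assumes Qn is bit n of a
free running eight bit counter that starts at zero on the first bit of the right channel word, and
it ignores the half-bit latch delay. The function name is hypothetical.

```python
MCLK_PER_BIT = 8   # the master clock runs at eight times the bit rate


def lock_window(counter):
    """True when the seven clock inputs of the gate are all one.

    Inputs taken from the text: NOT Q7, Q6, Q5, Q4, Q3, NOT Q2, NOT Q1.
    """
    q = [(counter >> n) & 1 for n in range(8)]
    return bool(
        (not q[7]) and q[6] and q[5] and q[4] and q[3]
        and (not q[2]) and (not q[1])
    )


window_counts = [c for c in range(256) if lock_window(c)]
window_bits = sorted({c // MCLK_PER_BIT for c in window_counts})
```

Under these assumptions the window opens only for counter values 120 and 121, which fall in the
sixteenth bit period of the frame (index 15 counting from zero), i.e. D0 of the first word.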

If, in Figures 8A and 9G, Dn is a one, then the inverted NAND gate 915 output [...]
and the state machine 800 can be instructed to reset to the bit following Dn, [...]. Since
each bit location from D15, D14, ..., D0, D15, D14, ..., D0 is guaranteed to alternate between one
and zero, except the locking bit, D0 of the right channel word, [...]
can quickly lock on to the location of the locking bit. In this synchronized state, lock-on to the
locking bit has been achieved. The need to locate the locking bit is why it is imperative that each of
the other bit locations are guaranteed to [...] in the bit stream such that
no other bit location remains in the zero state long enough to be mistaken as the locking bit.



Example 4 

In an embodiment such as described in Example 2 or Example 3, if the digital signal is
wirelessly transmitted through the air, for example from an FSK transmitter to a FSK receiver, the
receiver can be located in a remote unit while the transmitter can be located in a base unit. The base
unit can, for example, comprise the HRTF processing circuitry including DSP chip 600, EEPROM
710, and External EPROM 704, such as exemplified in Figure 23A, as well as the signal processing
circuitry 901, 924, 902, 903, 904, FSK transmitter 905, and transmitting loop 906, such as
exemplified in Figure 8A. The remote unit can, for example, comprise receiving loop 907, FSK
receiver 908, PLL 815, state machine 800, NAND gate 915, and associated circuitry exemplified in
Figure 8A, as well as input means for HRTF matching control 636, OK control 637, Noise control
703, Bass control 680, Ears control 629, Seat control 643, Ambience control 696, Theater control
624, Hall control 625, and Club control 626. Alternatively, the input means for the aforementioned
control functions can instead be located in the base unit. The headphones can be plugged into the
base unit or the remote unit to allow the headphone user to listen to the audio [...]. The wireless
transmission of the signal from the base unit to the remote unit allows the listener a greater range of
motion than if connected to the base unit by wire. If the input means for the control features are in
the remote unit, it is preferred to have some means for the remote unit to send information to the base
unit.

In a specific embodiment, the remote unit sends information to the base unit, for example,
by an infrared (IR) signal. Specifically, the remote unit has input means, for example, buttons, for
the listener to enter, for example, club 626, hall 625, theater 624, ambience 696, seat control 643,
ears control 629, bass control 680, noise control 703, OK control 637, and/or HRTF matching 636
signals. These command signals are transmitted to the base unit by, for example, IR.

In order for the remote unit to determine if the base received the IR signal, the base sends
a return signal from the base unit to the remote unit, in response to receiving the IR signal from the
remote unit. In a preferred embodiment, this invention encodes a tag bit on the RF digital
audio signal which, when received by the remote unit, indicates receipt of the IR
signal from the remote unit.

This tag bit is a bit encoded similarly to the locking bit. For example, if the locking bit is
encoded in the least significant bit location of the right channel word of the audio signal, then the tag
bit is, for example, encoded in the least significant bit location of the left channel word of the audio
signal. In a preferred embodiment, the tag bit is encoded, as a default value, opposite to the value
of the locking bit. For instance, if the locking bit is encoded as one, or a zero, then the tag bit will
be encoded, as a default value, as a zero, or a one, respectively. In a specific embodiment where the
locking bit is encoded as a zero, the default value of the tag bit can thus be a one and can therefore
be encoded by ORing each left channel word with 0000000000000001.

In operation, the receiver in the remote unit interprets a one in the tag bit location to mean
that no IR signal has been received by the base unit. When the base does receive an IR signal from
the remote unit, the base unit encodes a zero value in at least one consecutive tag bit location by
ANDing at least one left word with 11...10 instead of ORing with 00...01. In a preferred
embodiment, a zero value is encoded for eight consecutive tag bits to reduce the effects of noise, i.e.,
bit errors.
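
The tag bit handling on the base unit side can be sketched as below. The function names and the
`ack_start` parameter are hypothetical, and the rule used on the remote side (any zero tag bit
counts as an acknowledgement) is an assumption for illustration only.

```python
TAG_DEFAULT_OR = 0x0001   # 00...01: forces the tag bit to one (no IR command seen)
TAG_ACK_AND = 0xFFFE      # 11...10: forces the tag bit to zero (IR command received)
ACK_WORDS = 8             # eight consecutive tag bits, per the preferred embodiment


def tag_left_words(left_words, ack_start=None):
    """Return left channel words with the tag bit written into the LSB.

    ack_start is the index of the first word that carries an acknowledgement;
    None means no IR command was received.
    """
    out = []
    for i, w in enumerate(left_words):
        acking = ack_start is not None and ack_start <= i < ack_start + ACK_WORDS
        out.append(w & TAG_ACK_AND if acking else w | TAG_DEFAULT_OR)
    return out


def ack_received(tagged_words):
    """Remote side: report whether any tag bit in the window is zero."""
    return any((w & 1) == 0 for w in tagged_words)


words = [0x1234] * 12
tagged = tag_left_words(words, ack_start=2)
tag_bits = [w & 1 for w in tagged]
```

Repeating the zero over eight words gives the remote several chances to latch the
acknowledgement even if an individual tag bit is corrupted.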

The state machine 800 monitors the tag bit location, which is known relative to the locking
bit location. In a preferred embodiment, the locking bit is encoded in the least significant bit location
of the right channel word and the tag bit is encoded in the least significant bit location of the left
channel word. In this embodiment, the receiver of the remote unit monitors the tag bit much like it
monitors the locking bit. For example, an additional eight input NAND gate similar to NAND gate
915 having inputs Q7 806, Q6 814, Q5 813, Q4 812, Q3 802, NOT Q2 818, NOT Q1 819, and a
bit value from latch 916, SDATA 803, after inversion, 922, is used. Note, these are the same inputs
for monitoring the locking bit location, except NOT Q7 817 is replaced with Q7 806. Figure 9F
illustrates the alignment of SDATA 803, and the various clock signals when the state machine is in
lock with the locking bit.

If the inverted output of the NAND gate is a zero, then the tag bit is a one and no
IR signal has been received by the base. Alternatively, the inverted output of the NAND gate will
be a one only when the inverted bit 922 from SDATA 803 is a one, corresponding to the bit from
SDATA 803, the tag bit, being a zero. A zero value for the tag bit signifies the base unit has
received an IR signal from the remote.

The state machine 800 only looks for the tag bit during a small window of time, the tag bit
window 820, after a command is sent via the IR link. The remote clears the tag bit latch, transmits
the command word over the IR, and then watches for a zero bit to be latched onto the tag bit control
line. If a zero is latched, then the command was received by the DSP, the base; if a one is latched,
then the command was not received and no action is taken by the remote unit. When a one is latched
and no action is taken by the remote, the user would be required to press the command button again
and resend the command over the IR link. Once the receiver locks on to the locking bit, the location
of the tag bit will then be known.

It should be understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be suggested
to persons skilled in the art and are to be included within the spirit and purview of this application
and the scope of the appended claims.



1. A method of processing a signal comprising [...] least one channel has an audio component,
wherein said method allows a user of headphones to receive at least one processed audio component
and perceive that the sound associated with each audio component has arrived from one of a
plurality of positions, determined by said processing, wherein said method comprises the steps of:

(a) receiving the audio component of each said at least one channel;

(b) selecting, as a function of a user of headphones, a best-match set of head related
transfer functions (HRTFs) from a database of sets of HRTFs;

(c) processing the audio component of each said at least one channel via a
corresponding pair of digital filters, said pairs of digital filters filtering said audio
components as a function of the best-match set of HRTFs, each corresponding pair
of digital filters generating a processed left audio component and a processed right
audio component;

(d) combining said processed left audio component from each said at least one channel
of the signal to form a composite processed left audio component;

(e) combining said processed right audio component from each said at least one
channel of the signal to form a composite processed right audio component;

(f) applying said composite processed left and right audio components to headphones
to create a virtual listening environment wherein said user of headphones perceives
that the sound associated with each audio component has arrived from one of a
plurality of positions, determined by said processing.

2. The method, according to claim 1, wherein said database of sets of HRTFs is
generated by measuring and recording sets of HRTFs from a representative sample of the listening
population.

3. The method, according to claim 1, wherein each position of said plurality of
positions is predetermined and corresponds to one of said at least one channel.

4. The method according to claim 3, wherein, after the step of selecting a best-match
set of HRTFs, said method further comprises the step of selecting a position subset of HRTFs from
the best-match set of HRTFs, each of the selected HRTFs of said subset of HRTFs being selected
so as to correspond to a virtual position closest to one of said predetermined positions so that the
user of said headphones perceives that the sound associated with each said at least one channel
originates from or near to said corresponding predetermined position.

5. The method according to claim 1, further comprising [...] steps:
(a) processing the audio component of at least one of said at least one channel of the signal via a bass boost circuit prior to processing said audio component of said at least one channel via the pair of digital filters;
(b) prior to applying the composite processed left and right audio components to the headphones, further processing the composite processed left audio component and the composite processed right audio component via an ear canal resonator circuit.

6. The method according to claim 1, wherein said audio component of each said at least one channel of the signal is processed such that said predetermined [...] a Dolby Pro Logic® audio component.

7. The method, according to claim 1, further comprising the steps of:
(a) collecting a database of measured HRTFs;
(b) ordering said database so that a representative subset of the entire collection of HRTFs is obtained and stored in storage means; and
(c) selecting a best-match set of HRTFs from said storage means such that the user performing said selecting perceives audio signals processed using said best-match set of HRTFs in the proper spatial positions.

8. The method of claim 7 wherein said database is ordered by clustering said measured HRTFs.

9. The method of claim 7 wherein said representative subset comprises between [...] and 25 HRTF sets.

10. The method of claim 8 wherein said database comprises S*L*2 spectra, with
L = the number of locations measured; and
S = the number of different subjects measured, wherein
16 < S < 200.
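As a worked illustration of the database size recited above, the sketch below computes S*L*2 for given S and L and enforces the stated bound on S. Reading the factor 2 as one spectrum per ear is an assumption, and the function name `spectra_count` and the example value of L are hypothetical.

```python
def spectra_count(num_subjects, num_locations):
    """Number of spectra in the database: S * L * 2.

    The factor 2 is read here as one spectrum per ear (an assumption).
    """
    if not 16 < num_subjects < 200:
        raise ValueError("claim requires 16 < S < 200")
    return num_subjects * num_locations * 2


# Example with hypothetical values: 20 subjects, 100 measured locations.
total = spectra_count(20, 100)
```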

11. The method according to claim 8, wherein the step of matching [...] best-match HRTF set via HRTF clustering further comprises the steps of:
(a) performing cluster analysis on the database of HRTF sets based on the similarities among the HRTF sets to order the HRTF sets into a clustered structure, wherein there is defined a highest level cluster containing all the sets of HRTFs in the database, wherein each cluster of HRTF sets contains either one HRTF set, only HRTF sets which have no statistical difference between them, or a plurality of sub-clusters of HRTF sets;
(b) selecting a representative HRTF set from each one of a plurality of sub-clusters of the highest level cluster of HRTF sets;
(c) selecting a virtual target subset of HRTFs from each representative HRTF set, wherein each position subset of HRTFs is associated with a predetermined virtual target position;
(d) providing, to the user, a plurality of sound signals, each of said plurality of sound signals being filtered by one of said plurality of position subsets of HRTFs;
(e) selecting, by the user, one of said plurality of sound signals as a function of appropriate sound spatialization to said predetermined virtual target position, the selected sound signal corresponding to the best-match cluster, wherein the representative HRTF set of the best-match cluster defines the best-match HRTF set.

12. The method according to claim 11, wherein each selected representative [...] is a centroid or popular HRTF which most exemplifies the similarities [...] the sub-cluster of HRTF sets from which the representative HRTF set is selected.
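The two ways of choosing a representative (the centroid-like choice of claim 12 and the isolated choice of claim 13) can be sketched as follows. The sketch assumes each HRTF set has been reduced to a numeric feature vector and uses Euclidean distance; it also substitutes a medoid (the member with the smallest total distance to the others) for a true centroid. Both are illustrative assumptions, and `pick_representative` is a hypothetical name.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_representative(cluster, mode="centroid"):
    """Return the index of the representative member of `cluster`.

    mode="centroid": member closest overall to the others (a medoid).
    mode="isolated": member farthest overall from the others.
    """
    totals = [sum(distance(a, b) for b in cluster) for a in cluster]
    target = min(totals) if mode == "centroid" else max(totals)
    return totals.index(target)
```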

13. The method according to claim 11, wherein each selected representative HRTF is an isolated HRTF which is most different from the HRTF sets [...] from which the representative HRTF set is selected.

14. The method according to claim [...] best-match HRTF via HRTF clustering further comprises the steps of:
(a) after selecting, by the user, one of said plurality of sound signals as a function of [...] said predetermined virtual target position, selecting [...] each sub-cluster of the best-match cluster;
(b) selecting a subset of HRTFs from each representative [...] of the best-match cluster, wherein each subset of HRTFs is associated with a predetermined virtual target position;
(c) providing, to the user, a plurality of sound signals, each of said plurality of sound signals filtered with one of said plurality of subsets of HRTFs corresponding to the plurality of sub-clusters of the best-match cluster;
(d) selecting one of said plurality of sound signals as a function of a predetermined virtual target position, the selected sound signal corresponding to the best-match cluster, wherein the representative HRTF set of the best-match cluster defines the best-match HRTF set;
(e) repeating steps a through d until the best-match cluster contains only one HRTF set or contains only HRTF sets which have no statistical difference between them.
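The repeated narrowing in steps (a) through (e) amounts to descending a cluster tree one level per listening test. The sketch below assumes the clustered structure is already built; the `Cluster` class, the `choose` callback (which stands in for the listener picking the best-localized test sound), and the string labels are all hypothetical.

```python
class Cluster:
    """A node of the clustered structure of HRTF sets."""

    def __init__(self, representative, children=()):
        self.representative = representative  # label of the representative HRTF set
        self.children = list(children)        # sub-clusters; empty at the lowest level


def find_best_match(root, choose):
    """Descend from the highest-level cluster until no sub-clusters remain.

    `choose` receives the representatives of the current sub-clusters and
    returns the index of the one the listener judged best spatialized.
    """
    current = root
    while current.children:
        candidates = [child.representative for child in current.children]
        current = current.children[choose(candidates)]
    return current.representative
```

With a balanced tree, the number of listening tests grows with the depth of the tree, not with the number of HRTF sets in the database.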

15. A device for processing a signal comprising at least one channel, wherein each said at least one channel has an audio component, wherein said device processes each audio component such that a user of headphones can receive the processed audio component from each said at least one channel and perceive that the sound associated with each audio component has arrived from one of a plurality of positions, said device comprising:
(a) at least one pair of digital filters, each pair of digital filters receiving an audio component and applying a pair of head related transfer functions (HRTFs) to said audio component, the HRTFs being determined as a function of a user of the headphones from a database of sets of HRTFs, each pair of digital filters generating a left signal and right signal;
(b) a first combining circuit combining the left signals for each said at least one channel to form a left output signal; and
(c) a second combining circuit combining the right signals for each said at least one channel to form a right output signal, the left and right output signals, when applied to the headphones, creating a virtual listening environment wherein a user of said headphones perceives that the sound associated with each audio component has arrived from one of a plurality of positions, determined by said processing.

16. The device according to claim 15, further comprising any one [...]:
(a) a bass boost circuit coupled to at least one pair of digital filters, the bass boost circuit increasing a low frequency energy of a signal input to the [...];
(b) an ear canal resonator circuit coupled to the left and right output signals; and
(c) a reverberation circuit coupled to at least one of said at least one channel, a first output and a second output of the reverberation circuit being coupled to a respective one of the first and second combining circuits.

17. A method for producing sound over headphones that is accurately spatialized for a given user of the headphones which comprises:
(a) providing said user with a control device which controls a PROM programmed with a database of representative HRTFs sets amenable to selection by said user of a best-match HRTF set;
(b) transferring and storing said best-match HRTF set to RAM linked to a DSP; and
(c) processing an audio signal by said DSP using said best-match HRTF set and transmitting said processed audio signal to said user for perception.

18. The method of claim 17 wherein said processing comprises decoding said signal into a plurality of signals prior to using said best-match HRTF set and, in addition to said processing using said best-match HRTF set, optionally processing components of said plurality of signals by a method selected from the group consisting of early reflection processing, reverberation processing, bass boost processing, and any combination thereof.

19. The method according to claim 18 wherein said selection of said best-match HRTF set comprises transmitting sound via headphones to a user from a main processing device programmed with a plurality of HRTF sets which are representative of major clusters of HRTF sets in a database of HRTF sets measured from a sufficient number of individuals in the general population such that a statistical analysis of the measured data reveals that there would be little incremental enhancement in the fidelity of sound spatialization if a greater number of representative HRTF sets were used to program said processing device, and allowing the user to identify a first approximation of a best-match HRTF set by localizing sounds in pre-determined virtual locations.

20. The method according to claim 19 wherein said database of representative HRTFs is selected from a database of measured HRTF sets, generated by measuring the individual HRTF sets of at least sixteen individuals wherein said measuring is achieved using a single robot-arm positioned sound source.

21. A device for producing sound over headphones that is accurately spatialized for a given user of the headphones which comprises:
(a) a peripheral control device which controls a PROM programmed with a database of representative HRTFs sets from among which said user is able to select a best-match HRTF set; and
(b) a Random Access Memory (RAM) resident within a main processing device which is programmed with said best-match HRTF set.

22. The device according to claim 21 comprising a means for wired or wireless transmission of sound processed by said main processing device programmed with said best-match HRTF set.

23. The device according to claim 22 wherein said sound is a digital signal and said means for wireless transmission is a digital processing means comprising:
(a) a filtering means for removing the DC component from said digital signal;
(b) a first inverting means for inverting every other bit of said digital signal;
(c) an encoding means for encoding a locking bit into said digital signal.

24. The device according to claim 23 wherein any one or more of the following apply:
(a) said digital signal is a binary digital signal;
(b) said filtering means is an adaptive filter;
(c) said filtering means is a high-pass filter;
(d) said first inverting means is an exclusive OR gate having as inputs said digital signal and a digital bit stream comprising alternating ones and zeroes (...101010...); and
(e) said encoding means is an AND gate having as input said digital signal and a repeating sequence of (...11111...10...), wherein said AND gate encodes a zero [...] a locking bit every nth bit, where n is an integer.
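Items (d) and (e) describe two bitwise operations that can be sketched on a list of bits. The sketch assumes the alternating pattern starts with a one and that the AND mask is one everywhere except at every nth position; both are assumptions about details the claim leaves open, and the function names are hypothetical.

```python
def invert_every_other_bit(bits):
    """XOR the stream with the alternating pattern 1, 0, 1, 0, ..."""
    return [b ^ ((i + 1) % 2) for i, b in enumerate(bits)]


def encode_locking_zero(bits, n):
    """AND the stream with a mask that is 1 except at every n-th bit,
    forcing a zero (the locking bit) into those positions."""
    mask = [0 if (i + 1) % n == 0 else 1 for i in range(len(bits))]
    return [b & m for b, m in zip(bits, mask)]
```

Because XOR with a fixed pattern is its own inverse, applying `invert_every_other_bit` a second time restores the original bits, which is what a receiver needs.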

25. The device according to claim 24 wherein said digital signal is composed of digital words, wherein said locking bit is encoded into the least significant bit location of each digital word into which it is encoded.

26. The device, according to claim 25 wherein said locking bit is encoded into each digital word as the terminal bit of each said digital word into which it is encoded.

27. The device, according to claim 24 further comprising:
(a) a transmitting means for transmitting said digital signal; and
(b) a receiving means for receiving said digital signal.

28. The device, according to claim 27, wherein said receiving means comprises:
(a) a first locking means for locking onto the bit rate of said received digital signal;
(b) a second locking means for locking onto the locking bit of said received digital signal; and
(c) a second inverting means for inverting said previously inverted bits.
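The receiving side can be sketched in the same style. The sketch assumes the locking bit is a zero recurring every n bits and models the second locking means as a small two-state machine (SEARCH, LOCKED) that tests candidate positions; bit-rate recovery by the first locking means is not modeled. The names `lock_onto_locking_bit` and `required_hits` are hypothetical.

```python
def lock_onto_locking_bit(bits, n, required_hits=3):
    """Return the phase (0..n-1) of the recurring zero locking bit, or None.

    SEARCH: test a candidate position; any one seen there moves the
    candidate forward by one bit. LOCKED: `required_hits` zeros in a row
    were found n bits apart.
    """
    state = "SEARCH"
    candidate = 0  # index of the position currently being tested
    hits = 0
    i = candidate
    while i < len(bits):
        if bits[i] == 0:
            hits += 1
            if hits >= required_hits:
                state = "LOCKED"
                break
            i += n
        else:
            candidate += 1
            hits = 0
            i = candidate
    return candidate % n if state == "LOCKED" else None


def reinvert(bits):
    """Undo the transmitter's inversion of every other bit (XOR with 1, 0, 1, 0, ...)."""
    return [b ^ ((i + 1) % 2) for i, b in enumerate(bits)]
```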

29. The device, according to claim 28, wherein any or all of the following apply:
(a) said first locking means is a phase locked loop;
(b) said second locking means is a state machine; and
(c) said transmitting and receiving means are wireless.

30. A device for rapidly and accurately generating a database of HRTF sets based on measurements from a large number of individuals comprising:
(a) a single, robot-arm positioned sound source;
(b) a robot-arm for positioning said single sound source;
(c) a measurement control system; and
(d) transducers for measuring sound and distortions thereof as it is received at each ear of an individual whose HRTF sets are being measured, after being generated by said single sound source at various locations about the individual wearing said transducers.

31. The device of claim 30 wherein said transducers are positioned at the entrance of the outer ear canal of the individual whose HRTF sets are being measured.

32. A device for spatializing sound over headphones comprises:
(a) a means for storing a representative set of HRTFs selected from a database of measured HRTFs;
(b) a means for a user to select a set of HRTFs from said means for storing said representative set of HRTFs; and
(c) a means for processing audio signals using said set of HRTFs selected by the user such that the user perceives the corresponding sounds to be localized on the proper spatial positions;
wherein said database of measured HRTFs comprises S*L*2 spectra, with
L = the number of locations measured, and
S = the number of different subjects measured, wherein
16 < S < 200.

33. The method according to claim 17 wherein said signal is a digital signal and said transmitting comprises:
(a) removing the DC component of said digital signal if present;
(b) inverting every other bit of said digital signal; and
(c) encoding a locking bit into said digital signal.

34. The method according to claim 33, wherein any one or more of the following apply:
(a) said digital signal is a binary digital signal;
(b) said removing of said DC component is achieved by adaptive filtering;
(c) said removing of said DC component is achieved by high-pass filtering;
(d) said inverting of every other bit of said digital signal is accomplished by exclusive ORing said digital signal with a digital bit stream comprising alternating ones and zeroes (...101010...);
(e) said encoding of a locking bit into said digital signal is achieved by encoding said locking bit in a certain bit location of every nth word comprising said digital signal, wherein n is an integer; and
(f) said encoding of a locking bit into said digital signal is achieved by encoding said locking bit at every nth bit of said signal wherein said locking bit is always a one or always a zero and wherein n is an integer.
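Item (c), DC removal by high-pass filtering, can be illustrated with a one-pole DC-blocking filter. The sketch assumes the digital signal is treated as a sequence of numeric sample values; the filter form y[n] = x[n] - x[n-1] + r*y[n-1] is a common DC blocker chosen for illustration, not a structure named in the claims, and `remove_dc` and the default `r` are hypothetical.

```python
def remove_dc(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    `r` slightly below 1 sets how slowly the filter forgets; a constant
    (DC) input decays toward zero at the output.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```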

35. The method, according to claim 17, further comprising the steps of:
(a) transmitting said digital signal; and
(b) receiving said digital signal to produce a received digital signal.

36. The method according to claim 35, wherein said receiving step comprises:
(a) locking onto the bit rate of said received digital signal;
(b) locking onto the locking bit of said received digital signal; and
(c) inverting the previously inverted bits.

37. The method, according to claim 36, wherein any one or more of the following apply:
(a) said locking onto said bit rate of said received digital signal is accomplished by a phase locked loop; and
(b) said locking onto said locking bit of said received digital signal is accomplished with a state machine.

38. A storage means encoded with a database of HRTFs such that HRTFs appropriate for a particular individual may be retrieved from such storage means to act as a filter in digital processing of an audio signal transmitted to headphones for accurate sound spatialization.